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ABSTRACT 

We  describe  a  new  Maximum  Entropy  pole-zero  spectrum  estimation  method. 
The  model  is  designed  to  achieve  the  maximum  possible  entropy  subject  to  con¬ 
straints  on  the  first  few  correlation  and  cepstral  values.  The  solution,  which  is 
in  die  form  of  an  ARMA  model,  is  based  on  solving  a  generalized,  symmetric, 
almost-Toeplitz  eigenvalue  problem.  We  characterize  the  existence,  unique¬ 
ness,  stability  and  minimum  phase  properties  of  the  ^otution,  and  categorize  all 
possible  occurrences  of  canceling  pole-zero  pairs.  A  search  procedure  based  on 
a  fast  Levinson-like  algorithm  is  given  for  estimating  die  model,  and  examples 
are  presented  to  illustrate  its  performance.  A  special  case  of  die  method  gives 
a  model  estimate  similar  to  that  of  Pisarenko's  harmonic  retrieval  problem. 

August  30,  1985 

Submitted  to  IEEE  Transactions  on  Acoustics,  Speech,  Signal  Proc. 

Permission  to  publish  this  abstract  separately  is  granted. 

EDICS  Category:  3.1.3.1  Parametric  Spectral  Analysis 


*This  work  has  been  supported  in  part  by  the  Advanced  Research  Projects  Agency  monitored  by 
ONR  under  Contract  N00014-81-K-0742  NR-049-506  and  in  part  by  the  National  Sdenoe  Foundation 
under  Oram  ECS80-07102. 

*Now  working  at  Sanders  Associates,  Nashua,  New  Hampshire 


Maximum  Entropy  Pole-Zero  Estimation 

Bruce  R.  Musicus  1 
Allan  M.  Kabel 2 

Room  36-797 

Research  Laboratory  of  Electronics 
Massachusetts  Institute  of  Technology 
SO  Vassar  St. 

Cambridge,  Mass.  02139 
Phone:  (617)253-8845 


1.  Introduction 

,\ 

J  Maximum  Entropy  has  been  suggested  by  numerous  authors  as  a  good  objective 
measure  for  "optimally"  modeling  die  power  spectrum  of  a  wide-sense  stationary  ran- 
dom  process.  In  the  original  Maximum  Entropy  Spectral  Analysis  (MESA)  formula¬ 
tion  of  Burg[l,2],  the  power  spectrum  P  («•*“)  is  chosen  by  maximizing  the  entropy 
function  subject  to  constraints  on  several  of  the  correlations  of  the  model.  Jaynes[3,4] 
has  argued  that  the  resulting  maximum  entropy  model  accurately  describes  the  avail¬ 
able  information,  but  is  maximally  non-committal  with  regard  to  the  unavailable  infor¬ 
mation.  In  the  case  where  the  constraints  are  placed  on  a  set  of  uniformly  spaced 
correlations  of  a  one-dimensional  stationary  process,  Maximum  Entropy  analysis  leads 
to  an  all-pole  model  whose  coefficients  may  be  found  by  a  fast  Levinson  recursion  algo- 
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rithm.  Modifications  of  the  procedure,  such  as  the  covariance  method,  die 
forward/backward  covariance  method,  and  so  forth,  have  been  explored  in  depth  by 
numerous  authors[5, 6, 7]. 

fit  this  paper,  we  consider  a  generalization  of  the  Maximum  Entropy  Method  of 
spectral  estimation  (MEM)  in  which  we  find  the  power  spectrum  with  the  largest 
entropy  which  matches  both  a  set  of  correlation  and  a  set  of  cepstral  values.  Lagunas- 
Hernandez  et.  al.  [8]  first  showed  that  these  constraints  lead  to  an  Autoregressive 
Moving-Average  (ARMA)  model  for  the  power  spectrum.  Unable  to  solve  for  the 
pole  and  zero  polynomial  coefficients,  however,  they  used  an  approximate  solution 
technique  having  suboptimal  performance,  hi  this  paper,  we  solve  this  problem  exactly 
for  the  case  of  a  one-dimensional  process,  with  uniformly  spaced  correlation  and  cep¬ 
stral  lags  centered  about  zero.  We  transform  the  problem  into  an  equivalent  general¬ 
ized  real  symmetric  almost-ToepHtz  eigenvahie/eigenvector  problem.  The  entropy  of 
the  model  is  related  to  the  largest  eigenvalue  of  this  problem,  and  the  pole  polynomial 
is  the  corresponding  eigenvector.  The  zero  polynomial  is  then  found  by  a  simple  recur¬ 
sion.  The  formulas  are  similar  to  the  mixed  first  and  second-order  modeling  procedure 
suggested  by  MulHs  and  Roberts[9].  Except  for  possible  pole-zero  cancellation,  the 
pole  polynomial  is  guaranteed  to  be  stable.  If  the  model  we  calculate  is  strictly 
minimum  phase,  then  we  show  that  it  solves  the  constrained  Maximum  Entropy  prob¬ 
lem.  Otherwise,  we  show  that  there  does  not  exist  any  finite,  strictly  positive  power 
spectrum  which  exactly  maximizes  the  entropy.  We  speculate  in  this  case,  however, 
that  our  model  is  the  weak  limit  of  a  sequence  of  minimum  phase  models  which  match 


all  given  constraints,  and  which  asymptotically  achieve  the  maximum  possible  entropy. 

Because  the  eigenvector  problem  has  an  almost-Toeplitz  structure  with  displace¬ 
ment  rank  3,  there  exists  a  fast  Levinson-like  algorithm  for  finding  the  eigenvector 
once  the  eigenvalue  is  known.  This  same  algorithm  can  be  used  iteratively  to  help 
search  for  the  maximum  eigenvalue.  A  side  benefit  of  this  approach  is  that,  at  least  in 
theory,  we  can  detect  when  the  model  has  canceling  pole-zero  pairs  caused  by  choosing 
an  excessively  high  model  order. 

When  the  number  of  known  cepstra  is  equal  to  or  larger  than  the  number  of 
known  correlations,  and  all  known  cepstra  have  value  zero,  then  we  show  that  our 
MEM  model  yields  the  same  answer  as  Pisarenko’s  harmonic  retrieval  algorithm[10] 
We  conclude  with  several  examples  illustrating  the  performance  of  this  pole-zero  max¬ 
imum  entropy  estimator  on  a  variety  of  simulated  data. 

2.  Derivation  of  an  ARMA  Model 

Suppose  we  observe  a  segment  of  N  data  samples  x  [0],  .  .  .  ,  x  [N  - 1]  drawn  from 
a  zero-mean,  stationary,  ergodic,  complex-valued  Gaussian  random  process  with  unk¬ 
nown  power  spectrum  P(z).  (Though  we  only  consider  complex-valued  data,  the 
development  for  real-valued  data  would  be  virtually  identical.)  The  correlations  R  [n] 
of  this  process  are  defined  by: 


*[n]  =  E  |  x  *[k  Jx  [k  +  n  ]  ] 


for  any  k 


(2.1) 
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=  J  P(eJm)eJmn  ^  (2  2) 

—n 

The  power  spectrum  P{z)  of  this  process  is  the  z-transform  of  the  correlations: 

P(z)  =  2  *[»]*""  (2.3) 

It  =  -X 

We  define  the  cepstrum  of  the  process  x  [» ]  as: 

ir 

c[n]=  flog  (2.4) 

— n 

Using  the  available  time  series  data,  we  would  like  to  estimate  the  power  spectrum 
P(z).  The  classical  Blackman-Tukey[ll]  approach  to  power  spectrum  estimation  first 
estimates  the  correlations  from  the  data  by  a  formula  such  as: 

=  77*  2  **(*M*+"1  (2.5) 

n  k=o 

Unfortunately,  given  only  the  available  segment  of  N  data  points,  it  will  not  be  possi¬ 
ble  to  directly  estimate  the  correlations  beyond  lag  AT- 1.  Furthermore,  for  n  close  to 
N,  only  N  -  \n  |  terms  are  available  in  the  sum  for  estimating  R  [n ],  and  so  these  esti¬ 
mates  will  have  large  variance.  The  classical  power  spectrum  estimation  approach  mul¬ 
tiplies  the  estimated  correlations  by  a  tapered  window  which  attenuates  the  high  order, 
unreliable  lags.  Applying  a  Fourier  Transform  then  gives  a  smoothed  power  spectral 
estimate.  Increasing  the  lag  window  length  improves  the  spectral  resolution,  but  also 
increases  the  variance  of  the  spectrum. 

Burg[l,  2]  suggested  an  alternative  procedure  for  power  spectrum  estimation  which 
often  achieves  higher  resolution  with  less  variance  than  the  classical  estimates.  Assume 


that  our  estimates  of  the  first  p  +  1  low  order  correlations  R  [0],  ...,/?  \p  ]  are  exactly 
correct,  but  assume  that  the  remaining  correlations  are  unknown.  In  general,  there 
will  be  an  infinite  number  of  power  spectra  which  match  the  known  correlations.  Burg 
suggested  that  a  reasonable  estimation  approach  would  be  to  choose  the  power  spec¬ 
trum  which  matches  the  known  correlations,  but  otherwise  has  maximum  entropy  H . 
For  a  stationary  Gaussian  random  process,  the  average  entropy  can  be  shown  to  be 
proportional  to: 

ir 

H  =  f  log  (2.6) 

— ir 

Maximizing  this  entropy  formula  subject  to  the  correlation  constraints  yields  an 
Autoregressive  (AR)  model  for  the  power  spectrum.  When  the  data  consists  of  uni¬ 
formly  sampled  correlations  of  a  one-dimensional  process,  this  model  is  easily  calcu¬ 
lated  by  a  fast  Levinson  recursion  algorithm  which  is  guaranteed  to  yield  a  stable  all¬ 
pole  polynomial  estimate.  Numerous  studies  have  proven  the  high  resolution  capabili¬ 
ties  of  this  spectrum  modeling  approach,  and  have  shown  that  the  spectral  estimate  is  a 
good  model  of  the  envelope  of  the  actual  power  spectrum[6]. 

In  this  paper  we  consider  a  straightforward  extension  to  Burg’s  Maximum  Entropy 
procedure  which  leads  to  an  Autoregressive  Moving  Average  (ARMA)  model.  Sup¬ 
pose  that  we  know  the  exact  values  of  the  first  p  +  1  correlations  of  the  process, 
R  [0],  .  .  .  ,R  fp].  Also  suppose  that  we  know  the  exact  values  of  the  first  q  values  of 
the  cepstrum  of  the  process,  c[l],  .  .  .  ,  c[?].  (Note  that  the  entropy  is  equal  to  the 
zeroth  cepstral  coefficient  c  [01;  this  is  why  we  assume  that  we  know  only  the  values 
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c  [1] . c  [q  ].)  The  additional  cepstral  information  should  be  useful  in  generating  a 

much  higher  quality  spectral  estimate  than  we  could  achieve  with  just  correlation  data 
alone.  For  example,  in  applications  such  as  the  analysis  of  voiced  segments  of  speech, 
we  would  like  good  estimates  of  the  envelope  of  the  signal  power  spectrum,  without 
the  fine  structure  introduced  by  the  voiced  excitation.  It  is  well  known  that  the  low 
order  cepstral  coefficients  accurately  reflect  this  envelope  information  without  much 
degradation  from  the  impulse  train  excitation.  This  cepstral  property  has  been  exten¬ 
sively  exploited  in  homomorphic  systems  such  as  those  introduced  by  Kopec, 
Oppenheim  and  Tribolet[12],  and  Yegnanarayana[13). 


We  now  calculate  the  power  spectrum  with  the  largest  entropy  (2.6)  but  which 
satisfies  the  constraints  (2.2)  and  (2.4).  Recognize  that  =  R*[n]  and 

c[-n]  =  c *[n ].  Introduce  Lagrange  multipliers  {0*}**-^  and  =  and  form 
the  Lagrangian  L  by  adding  multiples  of  the  constraints  (2.2)  and  (2.4)  to  the  entropy 


+  t  H  J 
1*1  =  1  - 


J  log jr  -  C[k] 


Suppose  that  a  solution  to  the  Maximum  Entropy  problem  exists,  P(e^a>),  which 
exactly  matches  the  correlations  and  the  cepstra,  and  which  is  finite  and  strictly  posi¬ 


tive.  Then  by  the  Euler-Lagrange  Multiplier  Theorem[14]  this  solution  must  be  a 


critical  point  of  the  Lagrangian  with  respect  to  variations  in  P{eJOi)  and  variations  in 
the  multipliers  0*  and  p.*  .  Using  variational  calculus,  therefore,  a  necessary  condition 
which  the  Maximum  Entropy  solution  must  satisfy  is: 


o  =  |t?—  +  £  0*  +  £  VLk - ^5— 

P(eJ»)  |*|=o  |*|-l  '  P(en 


(2.8) 


(Loosely  speaking,  we  differentiate  L  with  respect  to  each  point  of  the  function 
P  (e-i™),  and  set  the  derivative  to  zero.)  Solving  (2.8)  shows  that  if  a  strictly  positive 
solution  exists  to  the  Maximum  Entropy  problem,  Pp  q (eJW)  >  0  for  all  w,  then  it 
must  have  the  form  of  a  rational  polynomial,  or  in  other  words,  an  Autoregressive 
Moving  Average  (ARMA)  model: 


w*)  - 


f  v* 

k=-P 


(2.9) 


The  Lagrange  multiplier  coefficients  {0*}  and  {p.*}  must  be  chosen  so  that  the  con¬ 
straints  (2.2)  and  (2.4)  are  satisfied.  It  is  well  known  that  linear  time  invariant  systems 
satisfying  finite  order  difference  equations  are  naturally  described  by  ARMA  models 
[IS];  thus  this  approach  to  spectral  estimation  has  the  potential  of  yielding  accurate 
models  for  many  physical  systems. 


3.  Determining  the  ARMA  Model's  Coefficients 


This  MEM  modeling  technique  was  first  published  by  Lagunas- Hernandez,  et 
a/[8].  In  this  original  paper,  however,  only  approximate  solutions  for  the  model 
parameters  {0*,p.*}  were  suggested.  We  have  found,  however,  that  it  is  possible  to 
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transform  this  problem  into  the  form  of  a  generalized  eigenvalue/eigenvector  problem 
which  is  comparatively  easy  to  solve. 

Assume  that  a  solution  P  (e^)  exists  to  our  constrained  Maximum  Entropy  prob¬ 
lem  which  satisfies  the  Paley- Wiener  condition  for  discrete  signals:[16] 


/  |  log /•(«■'") 


—IT 


d<a 

2ir 


<  00 


(3.1) 


Then  we  can  always  factor  the  power  spectrum  into  a  product  of  a  gain  y2  times  a 
minimum  phase  factor  G  (z)  times  a  maximum  phase  factor  G  *(l/z  *): 


P(z)  =  -,2G(2)G*(l/2*)  (3.2) 

where  G(z)  is  causal  and  stable,  with  leading  coefficient  of  1,  and  with  a  causal  and 

stable  inverse. 


G(z)  =1+2  g[n\z  n  (3.3) 

/i  =  1 

We  define  g  [0]  =  1,  and  g  [n  ]  s  0  for  n  <0.  Because  the  MEM  power  spectrum  (2.9) 
must  be  a  finite  order  rational  polynomial,  G  (z )  can  be  factored  into  a  ratio  of  two 
minimum  phase  polynomials: 


G(z)  = 


*«(») 

Ap(z) 


where: 


(3.4) 


V*)  =  1  +  1  anj> 2  "  (3.5) 

«=l 

*,(o  =  i+  i 

n=l 


(3.6) 


All  the  roots  of  Ap(z)  and  Bq{z)  must  be  inside  the  unit  circle.  It  is  convenient  to 
define  a^j,  **  1  and  q  =  1. 

Oppenheim  and  Schafer[17]  derived  a  simple  formula  for  minimum  phase  models 
which  relates  the  cepstral  coefficients,  c  [n  ],  to  the  coefficients  of  the  impulse  response, 
g  [n  ].  To  derive  this  relationship,  take  the  log  of  (3.2): 


2  c[n]z  "  =  logy2  +  logG(z)  +  logG  (1/z  ) 

n  =-oe 


(3.7) 


Since  G(z)  is  minimum  phase,  the  polynomial  log  G(z)  is  right  sided  with  non-zero 
coefficients  starting  at  lag  1.  Similarly,  logG*(l/z*)  is  a  left  sided  polynomial  starting 
at  lag  -1.  Matching  terms  with  equal  powers  of  z  on  both  sides  of  (3.7)  then  gives: 


logG(z)  =  2  " 

/i  =1 


(3.8) 


c[0]  =  H  =  logy2  (3.9) 

Equation  (3.8)  specifies  a  mapping  between  the  coefficients  of  the  minimum  phase 

polynomial  G(z)  and  the  cepstral  coefficients  c[n].  Differentiating  both  sides  with 

respect  to  z -1  gives: 

OC  1  X 

n  r  1  — *+1  n  r  1  —  »+1  — 


2  »c[nlj-»+1  =  2  (3.10) 

11=1 

Multiplying  (3.10)  by  G(z)  and  equating  terms  with  equal  powers  of  z  on  both  sides 
of  the  equation  thus  gives  a  recursive  formula  for  the  coefficients  g  [n  J  in  terms  of  the 
cepstral  coefficients  c  [n  ]: 


-;5 


g [«]  =  —  2  *c[*l*[®  ~*1  for  n  ~  1»2»— 

n  *=  l 

Thus  knowing  the  cepstral  coefficients  c  [1],  .  .  .  ,  c  [q  ]  is  exactly  equivalent  to  knowing 
the  first  4  +  1  coefficients  of  the  minimum  phase  factor,  g [0],  ■  ■  .  ,g[q\  (where 
£[0]- 1).  Furthermore,  the  entropy  H ,  which  is  equal  to  the  zeroth  cepstral  coefficient 
c[0],  is  just  the  log  of  the  model  gain  y2.  We  may  conclude  that  maximizing  entropy 

subject  to  knowledge  of  the  correlations  R [0] . R\p]  and  cepstra  c [1],  .  .  .  ,  c [q ] 

is  thus  equivalent  to  finding  a  minimum  phase  pole-zero  model  with  p  poles  and  q 
zeroes,  which  has  the  largest  possible  gain  y2,  and  which  also  matches  the  correlations 
A[0],  .  .  .  ,R[p]  and  impulse  response  coefficients g [0],  .  .  .  ,g[q\. 

Now  to  find  the  pole  and  zero  coefficients.  Multiplying  both  sides  of  (3.4)  by 
Ap{z)  gives: 


Ap(z)G{z)  =  Bq{x)  (3.12) 

Equating  like  powers  of  z  on  both  sides,  and  recognizing  that  g[0],  .  .  .  ,g[q]  are 

known,  gives  q  + 1  linear  equations  relating  the  bn  q  and  an  p  coefficients. 


• 

g[q-p 1  *  *  g[q  1 

•  • 

aPJ> 

= 

• 

J?  [o]  s['l 

1 

0  *[01 

1 

4  "  0 P*a 


(3.13) 


(3.14) 


where  nt  define  the  vectors  & ,  a  and  the  matrix  Gp^  in  the  obvious  way.  Note  that 
Gpq  is  a  (q  + 1)  x  (p  + 1)  rectangular  upper  triangular  Toeplitz  matrix.  Now  multiply 
both  sides  of  (3.2)  by  Aa(z)  and  substitute  (3.4)  to  get: 

=  f2S,(OG‘(l/r’)  (3.15) 

oe 

Since  P  (z )  =  2  *  [*  1*  ~H »  and  we  know  R  [0],  ...,/?  [p  ]  and  g  [0] . g  [g],  by 

equating  terms  with  like  powers  of  z  on  both  sides  of  (3.15),  we  can  construct  p  + 1 

more  linear  equations  relating  the  an  ^  and  bn  coefficients: 

\\  ■  ” 
ja  .  q 

*[oi  •  .  s  [«-p1  0 

;  .  .  j  '  .  y2  .  '  (3.16) 

«t:Pi  •  mJ\aY  l  ,'i,i  •  ,-[o]|  [  Y 

or: 

Rpfl  =  72G«,4  (3.17) 

where  we  define  Rp  as  the  (p  +  l)x(p  +  1)  Toeplitz  correlation  matrix  on  the  left  of 

(3.16),  and  where  is  the  Hermitian  (complex  conjugate  transpose)  of 

Gp  q .  We  can  eliminate  the  zero  polynomial  coefficients  k.  by  substituting  (3.13),  giv¬ 
ing: 

K^(p,9)fl=Q  (3.18) 

where  K y(p,q)  =  Rp  -  y2 G**q  Gp A 

Computing  the  solution  to  (3.18)  requires  finding  a  value  of  -y2  such  that  K .(p  ,q)  has 
a  non-trivial  null  space.  Then  a  is  an  appropriately  scaled  member  of  this  null  space. 
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This  matrix  K^(p  ,q )  has  a  very  interesting  structure.  Suppose  that  there  exists 


some  minimum  phase  polynomial  G(z)  and  gain  y2  such  that  this  polynomial  matches 
the  given  leading  impulse  response  and  correlation  values: 

£[«]  =  g[n]  for  n  =0.  ...  ,q  (3.19) 

y2  2  £*[*!£[* +*]  =  *[«]  for  n=0 . p 

*=o 

We  do  not  assume  that  the  polynomial  G  (z)  has  maximum  entropy;  in  general,  there 
will  be  many  such  polynomials.  Substituting  into  (3.18)  shows  that  the  Ky(p  ,q)  matrix 
must  equal: 

,  .  [*’[*-!>  I 

KJPA)  =  y2  2  : 

‘■’+l  l  i't*] 

The  matrix  K .{p  ,q)  is  thus  equal  to  the  covariance  matrix  formed  from  the  tail  of  the 
g[n].  This  matrix  K -(p,q)  must  therefore  be  symmetric  and  positive  semi-definite. 

We  conclude  that  the  MEM  gain  y2  must  be  chosen  so  that  Ky(p  ,q)  is  positive  semi- 
definite  with  a  non-trivial  null  space,  because  otherwise  the  MEM  model  impulse 
response  g  [n  ]  could  not  possibly  meet  the  constraints. 

To  calculate  the  MEM  gain,  we  multiply  (3.18)  through  by  1/y2,  thus  converting 
the  equation  into  the  following  generalized,  conjugate  symmetric  eigenvalue  problem: 


(  g[k~p]  •  ‘  *[*]  ) 


(3.20) 


(3.21) 


3 


for  all  generalized  eigenvalues  and  corresponding  eigenvectors  2,-.  (Note  that  the 
conventional  form  of  the  generalized  real  symmetric  eigenvalue  problem  requires  the 
matrix  on  the  right  hand  side,  Rp,  to  be  conjugate  symmetric  and  positive  definite 
[18, 19]).  The  MEM  gain  should  then  be  set  to  the  inverse  of  one  of  these  generalized 
eigenvalues,  and  the  pole  coefficient  vector  should  be  set  to  the  corresponding  eigen- 

y 

vector.  But  which  eigenvalue  do  we  choose  for  y  ?  Appendix  A  proves: 

Lemma  1  Assume  that  Rp  >  0.  Then  the  generalized  eigenvalue  problem 
(3.21)  has  p  + 1  non-negative  eigenvalue  solutions  Xf .  Let  be  the  largest 
eigenvalue;  this  will  always  be  strictly  positive,  XmaT  >  0.  Then  for  y2  in  the 
range  0  ^  y2  <  the  matrix  K^(p,q)  will  be  strictly  positive-definite. 

For  y2  =  1/Xmax  the  matrix  Ky(p  tq )  is  positive  semiklefinite  with  a  non-trivial 
null  space.  For  y2  >  l/XmaT,  the  matrix  1 Cy(p  ,q )  is  not  positive  semi-definite. 

Using  this  result,  Appendix  A  proves: 

Theorem  1  Assume  that  >  0.  Then: 

A)  y2  =  1/Xmax  **  the  only  choice  for  the  gain  which  satisfies  the  requirements  of 
the  Maximum  Entropy  problem. 

B)  Let  P  (z )  =  y2G  (z  )G  *(l/z  *)  be  any  power  spectrum  with  minimum  phase  fac¬ 
tor  G(z),  which  matches  the  leading  impulse  response  coefficients 

g[0],  .  .  .  ,g[q]  and  the  correlations  R [0] . R\p\-  Then  the  entropy  of 

this  model,  which  has  value  H  =  logy2,  is  bounded  above  by//  s  logy2. 

We  will  call  the  model  with  gain  y  =  l/XmaT,  and  pole/zero  polynomials  Ap(z), 
Bq(z)  which  satisfy  (3.18)  and  (3.13)  the  MEM(p,q)  solution.  We  can  calculate  this 
MEM  ( p,q )  power  spectrum  by  the  following  procedure: 

1)  Recursively  compute  the  minimum  phase  coefficients  g[0],  .  .  .  ,g[q]  from  the 
cepstral  coefficients  c [1] . c[q  \  using  (3.11). 


■  mjt  •,»  V  V  *.* 

,V*V»  >.'•  /• . 


V 


V'AWvV-.'' 


2)  Construct  the  matrices  QPtq  and  with  (njn)th  elements 

[Gpj]ns,  =  g[q~P+m-n]  and  [Rp]n>m  =R[m-n]. 

3)  Solve  for  the  largest  generalized  eigenvalue  solution  to  (3.21)  and  its 

y 

correspo nding  eigenvector  xm„-  Set  y  =  l/XmaT.  Set  d  equal  to  a  corresponding 
eigenvector,  scaled  so  that  =  1. 

4)  Recursively  compute  the  h.  coefficients  from  (3.13). 

5)  The  MEM  ( p,q )  power  spectrum  then  has  the  form 


WO  =  7 


(3.22) 


A,(0  A/(l/z’) 

The  special  cases  of  pure  Autoregressive  (AR)  or  pure  Moving  Average  (MA)  models 
are  particularly  easy  to  solve.  For  the  MEM  (p  ,0)  all-pole  modeling  problem: 


o]  (  0  •  • • 0  1  ) 


K>, 0)  =  Rp  -  y2  0 


(3.23) 


and  equation  (3.18)  reduces  to  solving: 


1 


(3.24) 


This  is  exactly  the  problem  Burg  suggested,  giving  a  p  pole  AR  model  matching  the 
first  p  + 1  correlations  /?[0],  .  .  .  Because  Rp  is  Toeplitz,  this  may  be  solved 

with  a  fast  Levinson  algorithm[5]. 
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Another  special  case  that  is  easy  to  solve  is  the  MEM  (0 ,q  )  all-zero  modeling  prob¬ 
lem,  when  only  the  zeroth  correlation  coefficient  R  [0]  is  known,  together  with  q  cep- 
stral coefficients c [1],  .  .  .  ,c[q\.  Equation  (3.18)  gives: 

.2  _  *T01 

"  i  i*[»nJ  <325> 

n  -0 

and  the  optimal  B(z)  polynomial  is  equal  to  the  leading  impulse  response  coefficients: 

K#  -  *[»]  for  n  =0,  .  .  .  ,q  (3.26) 

Hie  MEM {p, q )  solution  is  very  similar  to  the  least  squares,  mixed  first  and 

second  order  information  problem  suggested  by  Mullis  and  Roberts  for  filter  design[9]. 
The  chief  difference  is  that  in  their  problem,  the  gain  y  is  effectively  assumed  to  be 
known,  and  the  model  is  not  forced  to  match  the  given  correlations.  As  pointed  out 
by  Mullis  and  Roberts,  this  method  is  also  quite  similar  to  Ptony’s  ARMA  modeling 
method,  in  which  we  choose  the  pole  polynomial  which  optimally  linearly  predicts  the 
semi-infinite  tail  of  the  impulse  response  data,  and  then  find  the  zeroes  from  the  lead¬ 
ing  impulse  response  coefficients.  These  similarities  allow  us  to  carry  over  many  of  the 
stability  results  and  fast  algorithms  that  have  been  developed  for  these  other  methods. 

4.  Properties  of  the  MEM(p,q)  solution 

The  matrix  K^(p ,q)  has  an  interesting  structure  which  allows  us  to  prove  a 

variety  of  interesting  properties  for  this  MEM  algorithm.  From  (3.18)  we  see  that  the 

(n  ,m  )th  element  in  this  matrix  has  the  formula: 


m 


T  Tl  'TIT!  ■  f1  ■  S  *  T*  '5' *  J  ■  l  *  J  ■  L'l  L'HJI  J-  U  '  T* 
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[kjp,$)1  =/?[«-»]- 72  i;  «*[*-p+"k[*-p+«]  (4.i) 

l  Jn»'"  *=0 

where  we  define  #  [n  ]  =  0  for  n  <0.  Careful  examination  of  this  formula  shows  that 
we  can  recursively  define  K^(p  ,q )  in  terms  of  the  lower  order  matrix  K^(p  -l,q  -1) 
in  at  least  two  different  ways: 


Ky(p-l,q  -1)  * 


KyiPfl)  = 


*  # 


and 


(4.2) 


K>,9)  - 


*[0]  K[l]  •  •  •  K[p] 

Cm 

( £ [9  ~p ]  •  *  *  g[q] ) 

*[- 1] 

2 

g  [ q~p 1 

:  Kyip-U-l) 

-  y 

• 

• 

•r  a 

(4.3) 

U(-P]  J 

g  [9] 

where  /?[«]  = 


R[n] 

tx 

VI 

o* 

S3 

R[n]-y2q%  *’[*]*[*+«] 

if  q  > p  and  niO 

k  =0 

if  n  <0 

where  *  denotes  elements  whose  values  are  not  of  immediate  interest,  and  where  we 
define  K^(/i ,m )  =  R„  if  m  <0.  These  relationships  indicate  that  K y(p,q)  is  an 
almost-Toeplitz  matrix  with  displacement  rank  of  three[20, 21].  It  is  this  structure 
which  Mulhs  and  Roberts  exploited  to  derive  a  fast  Levinson-like  algorithm  for  finding 
the  vector  d  once  -y2  is  known.  This  structure  also  allows  us  to  analyze  the  existence, 
uniqueness,  and  stability  of  the  MEM(p,q)  solution.  Some  of  the  theorems  that  fol¬ 
low  are  improved  versions  of  theorems  in  [9];  others  are  new.  For  clarity,  we  first 


state  the  results  for  the  special  (and  most  common  case)  where  the  multiplicity,  r ,  of 
the  largest  eigenvalue  Xg^,  is  r  =  1.  Appendix  B  proves  the  following: 

Theorem  2A  Assume  that  Rp  >  0,  and  that  the  maximum  eigenvalue  solution 
\m„  to  (3.21)  has  multiplicity  r  =  1.  Let  y2  =  l/Xg^.  Then: 

A)  K-(p ,?)s0,  and  its  null  space  has  dimension  1.  For  1  sjsp, 
K.(p-s,q-s)>0. 

B)  The  MEM (p  ,q )  problem  has  a  unique  solution  y,Ap(z),Bq(z). 

Appendix  C  proves: 

Theorem  3  Under  the  same  assumptions  as  in  Theorem  2: 

A)  The  MEM(p,q )  model  pole  polynomial  Ap (z )  is  stable,  with  all  poles  inside 
or  on  the  unit  circle. 

B)  If  any  poles  are  actually  on  the  unit  circle,  then  they  will  be  canceled  by 
matching  zeroes  in  the  zero  polynomial  Bq(z). 

C)  Except  for  possible  canceling  pole-zero  pairs  on  the  unit  circle,  the  MEM  (jp  ,q) 
model  will  have  no  other  canceling  pole-zero  pairs. 

Unfortunately,  it  is  quite  easy  to  construct  cases  in  which  the  MEM  (p  ,q )  solution 
places  canceling  pole-zero  pairs  on  the  unit  circle.  Consider,  for  example,  the  follow¬ 
ing  MEM  (1,1)  problem:  R  [0]  =  2,  R  [1]  =  1,  c[l]  =  0.  Using  the  recursive  equation 
(3.11),  we  derive  g[0]  =  1,  and  g[l]  =  0.  Constructing  the  Rj  and  G1>1  matrices  and 
solving  the  generalized  eigenvalue  problem  (3.21),  we  find  that  X0-  1  and  Xt  =  1/3. 
Thus  r  =  1,  and  the  unique  MEM  (1,1)  solution  is: 


This  model  has  a  canceling  pole-zero  pair  at  1.  Its  leading  impulse  response  coeffi- 

A 

dents  are  $[0]  =  1  and  i[l]  =  0.  Its  power  spectrum  is  j(z)  =  1,  and  its  correla¬ 
tions  are  R  [0]  =  1  and  R  [1]  -  0.  Note  that  although  the  impulse  response  values  are 
correct,  the  correlation  values  do  not  match  the  given  values  R  [0]  and  R  [1].  We  will 
return  to  this  problem  in  theorem  5. 

When  the  multiplidty  r  of  the  largest  eigenvalue  Xmai  in  (3.21)  is  greater  than  1, 
then  the  situation  is  more  complicated.  In  this  case,  it  turns  out  that  it  is  the 
MEM  (p  —r  + 1,?  -r  + 1)  problem  which  has  a  unique  solution.  The  MEM(p,q)  solu¬ 
tion  differs  from  the  MEM (p -r  +  1,9  -r  +1)  solution  only  in  that  it  may  contain  up 
to  r  - 1  extra  canceling  pole-zero  pairs  which  may  be  placed  anywhere.  In  more  detail. 
Appendix  B  proves: 

Theorem  2B  Assume  that  Rp  >  0,  and  that  the  maximum  eigenvalue  solution 
Xmax  to  (3.21)  has  multiplidty  r .  Let  y2  =  l/XmaT.  Then: 

A)  r  s  min(p  ,9  )  +  1 

B)  For  0si<r,  K.(p -s,q-s)  2: 0,  and  the  null  spaces  of  these  matrices  have 
dimension  r  ~s .  For  rSjSp,  K..Q?  -  s  ,q  -s)  >  0. 

C)  The  MEM(p  -r  +  1,9  -r  +  1)  problem  has  a  unique  solution  y,  Ap  _r  +  t(z), 

—  r  +  l(z)' 

D)  For  s  in  the  range  0  s:  s  <  r ,  all  solutions  of  the  MEM(p  -s  ,q  -s)  problem 
have  the  same  gain  y  as  the  MEM (p -r +  l,q -r +  1)  solution. 

E)  For  s  in  the  range  Osj  <  r ,  the  pole  and  zero  polynomials  Ap_s(z)  and 
Bq-S(z)  are  solutions  to  the  MEM  (p  -  $  ,q  -  s)  problem  if  and  only  if 
Ap.s(z)^Ap_r^(z)^r  _,_!(*)  and  Bq_s{z)  = 

where  <J>r  ~a~\{z )  is  a  polynomial  of  order  up  to  r  -s  - 1  with  leading  coeffi- 
dent  4>q  =  1.  In  other  words,  the  MEM(p  -s,q -s)  solution  is  unique  except 
for  having  up  to  r  -  s  - 1  canceling  pole-zero  pairs,  4>r  _i(z  )• 

F)  For  s  in  the  range  0sj<r,  the  MEM(p  -s  ,q -s)  power  spectrum 
Pp-sj-s^2)  *s  uniquely  defined,  and  is  equal  to  the 
MEM {p  -r  +  1,9  -r  +  1)  power  spectrum  Pp  _r  +  i4  r  +  1(z). 


Theorem  3  applies  to  the  MEM  (p  -r  + 1  ,q  -r  + 1)  model.  Because  the  r  - 1  canceling 
pole-zero  pairs  in  the  MEM(p,q)  solution  may  be  located  arbitrarily,  however,  it  is 
impossible  to  guarantee  that  the  MEM  ( p,q )  model  has  all  its  poles  inside  the  unit  cir¬ 
cle. 


Does  the  MEM  (p  ,q)  solution  match  the  given  constraints  and  solve  the  original 
Maximum  Entropy  problem?  The  answer  is  "yes"  if  the  solution  is  strictly  minimum 
phase,  but  "no"  otherwise.  In  more  detail,  Appendix  D  proves  (for  any  multiplicity  r): 

Theorem  4  Assume  that  Rp  >  0  and  that  \maT  has  multiplicity  r .  Let 
y2  =  l/Ajnjgp  and  let  Ap(z),  Bq(z)  be  any  pole-zero  solution  to  the  MEM(p  ,q) 
problem.  Then: 

A)  The  first  q  + 1  coefficients  of  the  causal  impulse  response  of  Bq{z)IAp{z)  are 
equal  to  ;[0],  .  .  .  ,g[q]. 

B)  If  the  poles  of  Ap(z)  are  strictly  inside  the  unit  circle,  then  the  first  p  +1 
correlations  of  the  model  are  equal  to  R  [0],  .  .  . ,  R  \p  ]. 

C)  If  the  zeroes  and  the  poles  are  all  strictly  inside  the  unit  circle,  so  that  the 
model  is  strictly  minimum  phase,  then  the  first  q  cepstral  coefficients  of  the 

model  are  equal  to  c[l] . cfa].  Furthermore,  the  model  is  the  solution 

to  our  original  constrained  Maximum  Entropy  problem,  with  entropy  equal  to 
H  —  logy  . 


Appendix  D  qualifies  these  results  with  the  following: 

Theorem  5  Under  the  same  assumptions  as  in  theorem  4: 

A)  If  the  model  has  poles  on  the  unit  circle  at  p;  =  e for  some  set  of  o>f ,  then 
these  will  be  canceled  by  matching  zeroes  at  pj .  The  model  correlations  R  [n  ] 
will  still  be  well  defined,  but  they  may  no  longer  match  the  given  correlations, 
and  the  error  may  have  the  form: 

*j-i 

R[nJ-i?[n|  =  2  2  ^iJn2jPp  (4.5) 

i  j  =0 


where  the  y.*  j  are  some  set  of  weights,  and  where  st  is  the  multiplicity  of 


B)  If  the  poles  are  strictly  inside  the  unit  circle,  but  one  or  more  zeroes  are 
strictly  outside  the  unit  circle,  then  the  model  will  not  match  the  cepstral 
values  c [1],  .  .  .  ,c[q\,  and  the  model  entropy  will  be  larger  than  logy2. 

C)  For  either  cases  A)  or  B),  there  does  not  exist  a  finite  and  strictly  positive 
power  spectrum  meeting  the  Paley- Wiener  condition  which  matches  the  given 
correlation  and  cepstral  values,  and  which  also  achieves  the  maximum  possible 
entropy. 

Note  that  in  our  MEM  (1,1)  example  above,  with  the  canceling  pole-zero  pair  at 
p0  =  1,  the  correlation  matching  error  has  exactly  the  form  predicted  by  this  theorem, 
R[n]-R[n]=  1". 

To  summarize,  if  the  solution  to  the  MEM(p,q)  problem  is  strictly  minimum 
phase,  then  it  is  the  unique  power  spectrum  solution  to  the  constrained  maximum 
entropy  problem.  On  the  other  hand,  there  are  a  variety  of  situations  in  which  we 
may  encounter  difficulties: 

1)  If  the  correlation  matrix  Rp  is  only  positive  semi-definite,  then  these  theorems  do 
not  apply.  In  this  case,  only  one  power  spectrum  exists  which  matches  the  known 
correlations,  and  this  is  a  line  spectrum  formed  from  p  impulses[10].  The  entropy 
and  the  cepstral  coefficients  of  this  spectrum  are  not  well  defined. 

2)  If  the  maximum  generalized  eigenvalue  solution  to  (3.21)  has  multiplicity 
r  >  1,  then  canceling  pole-zero  pairs  may  occur  outside  the  unit  circle,  and  so 
Ap(z)  may  be  unstable. 

3)  The  solution  to  the  MEM  (p  -r  +  \,q  -r  +  1)  problem,  though  unique,  could  have 
one  or  more  canceling  pole-zero  pairs  located  on  the  unit  circle.  In  this  case,  the 
model  may  not  match  the  given  correlations  or  cepstra. 
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4)  Even  if  all  the  poles  of  the  MEM  (p  -r  +  \,q  -r  +  \)  solution  are  strictly  inside  the 
unit  circle,  one  or  more  zeroes  may  be  outside  the  unit  circle.  In  this  case,  the 
model  will  match  the  correlations  R [0],  .  .  .  ,R\p]  and  the  causal  impulse 
response  will  match  g[0],  ■  ■  ■  ,g[q],  but  since  the  model  is  not  minimum  phase, 
it  will  not  match  the  cepstra  c [1] . c[q]. 

In  practice,  when  we  estimate  the  correlations  and  cepstra  from  finite  segments  of 
data,  this  fourth  case  occurs  quite  often.  As  a  result,  in  practice  the  MEM(p,q)  solu¬ 
tion  often  does  not  solve  the  original  Maximum  Entropy  problem.  We  discuss  this 
further  in  section  7,  where  we  present  several  as  yet  unproven  speculations.  In  prac¬ 
tice,  when  excessively  high  model  orders  are  used,  we  should  also  note  that  it  is  com¬ 
mon  to  see  nearly  canceling  pole-zero  pairs  appear  near  the  unit  circle,  causing 
extremely  sharp,  randomly  located  spikes  in  the  power  spectrum. 

5.  Algorithms 

The  most  straightforward  algorithm  for  solving  the  MEM (p  ,q)  problem  is  to  fol¬ 
low  the  procedure  outlined  in  section  3,  building  the  matrices  Rp  and  Gp  q,  and  then 
calling  the  RSG  driver  in  the  EISPACK  library[22]  to  compute  all  the  generalized 
eigenvalues  X;  and  eigenvectors  ^ .  If  the  multiplicity  r  of  the  largest  eigenvalue  is 
greater  than  one,  then  the  model  order  is  too  high  and  pole-zero  cancellation  is  a  pos¬ 
sibility.  (In  practice  we  would  have  to  test  whether  the  largest  eigenvalues  are  equal  to 
within  a  certain  tolerance).  Either  decrease  the  model  order  to 
MEM {p  -r  +  l,9~r  +  l)  and  try  again,  or  else  find  a  linear  combination  of  the  eigen- 


vectors  corresponding  to  X,^  which  have  the  last  r  -1  coefficients  all  equal  to  0.  In 


either  case,  we  set  -y2  =  1/Xj^,  derive  the  pole  polynomial  by  scaling  the  eigenvector 
corresponding  to  X,^,  and  derive  the  zero  polynomial  from  equation  (3.13). 

A  faster  algorithm  relies  on  the  almost-Toeplitz  displacement  rank  3  structure  of 
Ky( p  ,q).  We  start  by  considering  the  following  set  of  minimization  problems: 

an  -  minaH1Li(n,n+q-p)a  (5.1) 

for  n  =  0,  ...  ,p ,  where  a  =  (an>n  *  •  •  a 1  )T  is  a  vector  constrained  to  have 
coefficient  ag,/i  =  and  dn  is  the  value  at  which  the  function  achieves  its  minimum. 
Also  define  en  as  the  value  of  this  quadratic  function  at  the  minimum: 

en  =  a„H  Ky(n  ,n+q-p)aH  (5.2) 

If  we  choose  y^l/X^,,,  then  by  Lemma  1,  K^(p  ,?)>0.  Since  by  (4.2), 

K^(n  ,n+q-p )  is  a  principle  minor  of  Ky(p  ,q),  we  must  have  ,n+q-p)  2: 0 
also.  Thus  (5.1)  involves  minimizing  a  quadratic  positive  semi-definite  function.  Dif¬ 
ferentiating  with  respect  to  the  real  and  imaginary  parts  of  ak  ^  and  setting  the  results 
to  zero,  we  get  a  set  of  linear  equations  for  the  minimizing  vector  On  •  Substituting 
into  (5.2)  we  get  a  formula  for  e„ .  Combining  these  gives: 


K^(n  ,n+q-p)an 


0 

0 


(5.3) 


e 


n 


Because  K^(n ,n  +q  -p)  2=  0,  we  have  «„  ^  0.  Note  that  our  MEM{p,q)  problem  is 
therefore  identical  to  solving  the  minimization  problem  (5.1)  with  gain  y2  =  1/Jw; 


with  this  gain  we  will  have  *.p  =  0. 


Mullis  and  Roberts[9J  recognized  that  the  displacement  rank  3  structure  of 
Ky(p  ,q)  allows  us  to  recursively  and  quickly  compute  the  solution  in  terms  of 

the  €b_i  solution.  If  the  correct  gain  y2  =  l/XmaT  were  known,  therefore,  we 
could  run  this  fast  Levinson-style  algorithm  to  compute  an ,  e„  for  n  =  0,  .  .  .  ,p. 
With  this  correct  gain  y2,  we  will  get  ep  =  0,  and  the  vector  Op  will  be  the  pole  poly¬ 
nomial  solution  to  the  MEM ( p,q )  problem.  We  will  show  that  the  same  algorithm  can 
also  be  used  to  test  the  positive  definiteness  of  K^(p  ,q ).  With  the  guidance  of  Lemma 
1,  therefore,  we  can  develop  a  search  algorithm,  based  on  this  fast  Levinson-style  algo- 
rithm,  to  find  the  value  y  which  makes  K.(p  ,q )  exactly  positive  semi-definite.  The 
MEM  (p  ,q)  pole  polynomial  coefficients  a-Op  will  be  computed  at  the  same  time. 
At  least  in  theory,  this  algorithm  can  also  be  used  to  detect  when  the  multiplicity  r  of 
the  largest  eigenvalue  is  greater  than  1;  in  this  case,  it  automatically  gives  the  unique 
pole  polynomial  solution  to  the  MEM  (p  -r  +  l,q  -r  + 1)  problem. 

The  Mullis  and  Roberts  algorithm  is  motivated  by  the  two  recursive  definitions  of 
K^(p,g)  in  terms  of  K^(p  -l,q  -1)  given  in  equations  (4.2)  and  (4.3).  In  order  to 
solve  for  a„  efficiently,  we  will  need  to  calculate  auxiliary  vectors 
4.  =  (d0s  '  *  *  4.^.  )Tand£,  =  (/0>n  •  •  •  fn>n  )T  which  satisfy: 

1 

0 


l^(n,n+q-p)dn  = 


0 


(5.4) 


(5.5) 


I Ly(n,n+q-p)fn  = 

g’[n+q-p] 

We  will  also  need  to  recursively  calculate  the  2x2  matrix  M„  defined  by: 


K^inyti+q-p)  (  dn  j  (5-6) 

In  Appendix  E  we  show  that  all  these  quantities  can  be  recursively  computed  by  the 
following  Levinson-style  algorithm: 

Mullis-Roberts  Algorithm: 

Initialization: 


R[n]  = 


R[n] 

4*0 


for  n  =0,  .  .  .  ,p 


fl0  =  (>) 


€0  =  *(0)  -  y2\g[q-p\  |2 

^0  =  (  1/e0  ) 

/o  = 


For  n  =  1, . . .  ,  p 

4>„  =  2 


+M  "  2  !t\<l-p+n-k\akfn- 1 
k  =0 


=  {  «M-1  - 


for  k  =  0 


$ndn-k*-\  +  y1*nfn-ksi-\  tor  k  =  . 


vn  '♦'n 

•  =  M_  _i  2 

M>n  1  -in 


».  -  Cl 


€n  =  «*-l  "  vn  + 

DF  n  =  p  OR  e„s0  RETURN( n  , eH  ,an) 

vn 

dk ,n  ~  dk,n-\  ~  ~  an-kjt  ^or  *  =  0,  .  .  .  ,  n 

e/i 

p.„ 

fkjn  ~  fkj i-l  ~  ”  an-kji  k  =  0 . n 


M.  =Ma 


+  A_  V"  ( 

1  «a  *n 


The  computation  on  each  pass  is  approximately  6 n  *- 11  operations  (1  operation  ~  1 
multiply  +  1  add).  Total  computation  is  thus  about  3 p2+ 14 p  operations. 

Note  that  we  terminate  this  iteration  when  en  s  0.  The  purpose  of  this  is  as  fol¬ 
lows.  If  €n  <  0  for  some  n ,  then  by  equation  (5.2),  ,n  +q  -p)  cannot  be  posi¬ 
tive  semi-definite.  Since  by  (4.2)  this  matrix  is  a  principle  minor  of  K^Q?  ,q),  this  last 
matrix  cannot  be  positive  semi-definite  either.  Thus  y2  is  too  large,  and  we  might  as 
well  terminate  the  iteration. 

More  interestingly,  if  y2  =  y2  =  l/XmftT,  then  K^(n  ,n  +q  -p)>  0  for 
Osr  <p-r  + 1  and  K,.(p -r  +  l,q -r  +  1) 2: 0  with  a  null  space  of  dimension  one. 
Equation  (5.2)  then  guarantees  that  «„  >  0  for  n  ~  0,  .  .  .  ,p  -r  and  e_  _r+1  =  0. 
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Also  the  minimizing  solution  ap  _r+i  will  be  the  MEM(p -r  + 1  ,q  -r  + 1)  pole  polyno¬ 
mial  solution.  Thus  if  y2  =  y2,  the  test  will  tenninate  the  algorithm  at  step 

n  =  p-r  + 1,  and  will  return  the  MEM(p-r  +  l,q -r  +  1)  solution  flp_r+j. 

This  algorithm  has  some  interesting  features  when  p  >  q.  In  this  case,  for 
n  =  0,  .  .  .  ,p  -q  - 1  we  will  have: 


In 


(5.7) 


-(:•:) 


=  0 

vn  =  $„/«» 

For  the  first  p-q-1  steps,  therefore,  this  algorithm  has  the  same  form  as  Levinson 
recursion,  where  an  plays  the  role  of  the  forward  predictor,  tnd„  plays  the  role  of  the 
backward  predictor,  cR  is  the  prediction  error,  and  -  <J>R/eB  is  the  reflection  coeffi¬ 
cient.  Furthermore,  all  these  values  on  the  first  p-q-1  steps  will  be  independent  of 
the  value  of  y2. 

This  algorithm  will  work  correctly  for  any  value  of  y2,  provided  only  that  e„  *  0 
for  n  -  0,  .  .  .  ,p  -1.  In  particular,  K y(p,q)  does  not  have  to  be  positive  semi- 
definite.  In  fact,  the  values  of  e„  can  be  used  to  test  the  positive  definiteness  of 
K 7(p  ,q).  Appendix  F  proves  the  following: 


Theorem  6  Assume  that  we  can  find  vectors  a*  and  scalars  e*  which  satisfy 
(5.3)  for  n  =  0,  .  .  .  ,p.  Then: 

A)  If  0 ^ y2 <  l/XmaT,  then  •  •  •  sep  >0.  The  Mullis-Roberts  algo¬ 

rithm  will  correctly  generate  all  the  a* ,  «„ . 

B)  If  y2  =  l/X^,  then  e0s  •  •  ^  ep_r  >  0  and  ep_r  +  i  =  •••-«,-  0, 

where  r  is  the  multiplicity  of  the  largest  eigenvalue  XmaT.  The  Mullis-Roberts 
algorithm  will  correctly  generate  eB  for  n  =  0,  .  .  .  ,p -r  +  1.  For 
n  =  p-r+2 . p  we  can  seta*  =  ( 0  a?~  j  )T  and  ert  =0. 

C)  If  y2  >  l/Xjna,,  then  at  least  one  eH  is  strictly  negative. 

This  theorem  suggests  a  simple  binary  search  algorithm  for  finding  the  proper  gain 
value  -y2  =  l/X^,  and  for  calculating  the  corresponding  pole  coefficients  ap .  The 
idea  is  to  guess  a  value  for  y2,  and  then  run  the  recursive  Mullis-Roberts  algorithm.  If 
the  €„  are  all  strictly  positive,  then  K. y(p  ,q )  >  0  and  the  value  of  y2  is  too  low.  If  one 
of  the  is  negative  or  zero,  then  K^(p  ,q)  is  not  positive  definite,  and  the  value  of  y2 
is  either  exactly  correct  or  too  high-  Adjust  the  value  of  y2  accordingly,  and  try  again. 
When  the  value  of  y2  is  known  to  sufficient  precision,  we  can  stop  the  search,  and  use 
the  ap  vector  as  the  MEM  (p,q)  pole  polynomial  solution.  The  zero  polynomial  coeffi¬ 
cients  may  then  be  found  using  (3.13).  Putting  all  this  together,  we  get: 

Complete  MEM(p,q)  Algorithm: 

Initialization 


tol  -  (small)  preselected  error  tolerance 


Iterate  until  y#  -  y  2  <  to/ 

y2  =  +  7l)/2 

n ,  e . ,  a.  -  return  values  from  Mullis-Roberts  algorithm,  using  gain  y2 

ft  /I 

If  n  =  p  and  €p  >  0 

set  y  2  ~  y2  “d  iterate 

Else 

set  7 ]j  =  y2  and  iterate 

When  7/j  —  7 2  <  to/,  then  DONE 

Multiplicity  r  =  p  -n  + 1 

Calculate  the  zero  polynomial  coefficients  from  (3.13) 

Return  7 h>  r >  <*k,p-r  +  l>  bk,q-r+l 

At  each  step  in  this  binary  search  algorithm,  y 2  and  7^  represent  lower  and 
upper  bounds  on  the  feasible  range  of  values  of  the  gain,  y2  <  y2 ^  7^.  The  initial 
value  7  2  =  0  comes  from  Lemma  1.  The  initial  value  of  y  jj  is  found  by  noting  that 
for  y2  larger  than  this,  the  (p  j> )  element  of  K^O P  >9 )  *5  negative,  and  thus  K^(p  ,q ) 
cannot  possibly  be  positive  semi-definite.  Each  loop  tests  the  value  at  the  center  of  the 
range,  decides  whether  it  is  greater  than,  less  than  or  equal  to  7  .  Thus  we  can  cut 
the  feasible  range  of  values  for  7  in  half.  Faster  procedures  than  binary  search  could 
probably  be  devised,  but  binary  search  has  the  advantage  of  achieving  a  given  level  of 
precision  for  y2  after  a  fixed  number  of  passes. 

Due  to  numerical  inaccuracies,  it  is  likely  that  the  preceding  algorithm  will 
underestimate  the  multiplicity  of  the  largest  eigenvalue,  and  thus  overestimate  the 


model  order.  One  potential  method  of  improving  this  model  order  identification  is  to 
run  the  binary  search  algorithm  to  estimate  y2.  Then  call  the  Mullis-Roberts  algorithm 
again,  this  time  using  a  gain  y2  +  toll,  where  toll  is  a  small  positive  number.  With 
the  gain  set  slightly  too  high,  the  matrix  will  be  slightly  non-positive  definite.  In  par¬ 
ticular,  if  r  is  the  true  multiplicity  of  the  largest  eigenvalue,  then  e^_r+1  will  be 
slightly  negative,  and  the  iteration  will  stop  at  step  p-r  + 1  despite  small  numerical 
errors.  Thus  the  true  multiplicity  r  will  be  correctly  identified,  and  the  Mullis-Roberts 
algorithm  can  be  run  one  last  time,  with  the  correct  gain  and  model  order 
MEM(p  -r  + 1  ,q  -r  + 1),  to  identify  the  correct  low-order  ARMA  model. 

If  p  >  q ,  then  the  first  p  -q  + 1  steps  of  the  recursive  algorithm  will  give  the  same 
results  regardless  of  the  value  of  y2.  To  save  time,  therefore,  we  need  only  restart  the 
recursive  algorithm  at  step  n  =  p  ~q  for  each  new  value  of  y2  that  we  test. 

The  following  theorem,  proved  in  Appendix  G,  is  sometimes  useful: 

Theorem  7  Assume  that  >  0.  Suppose  that  y2  <  y2  so  that 
€q . t.p  >  0.  Then: 

A)  The  polynomial  Ap{z)  =  £  ak„z~k  is  strictly  stable  with  all  roots  strictly 

*=o 

inside  the  unit  circle. 

B)  Form  the  polynomial  Bq(z)  by  substituting  the  coefficients  of  Ap(z)  into  the 
recursion  (3.13).  Then  the  leading  coefficients  of  the  impulse  response  of 
yBq  (z  )/Ap  (z )  match  g  [0],  .  .  .  ,  g  [q  ],  but  the  correlations  R  [n  ]  of  the  model 
are  off  by: 
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6.  Pisarenko’s  Method 

There  is  a  special  case  of  our  MEM (p  ,q)  method  which  turns  out  to  be  identical 
to  Pisarenko’s  well  known  method  for  fitting  a  harmonic  line  spectrum  to  a  set  of 
correlations[10].  Assume  that  there  are  at  least  as  many  zeroes  as  poles,  q  a:  p ,  and 

that  the  known  cepstral  values  are  all  zero,  c  [n  ]  =  0  for  n  =  1 . q.  Also  assume 

that  Rp  >  0  (this  is  not  necessary  to  prove  the  following  results,  but  it  allows  us  to 
apply  our  theorems  without  modification).  From  the  recursive  equation  (3.11),  the 
leading  minimum  phase  coefficient  values  must  be  g  [0]  =  1  and  g  [n  ]  =  0  for 
n  =  1,  .  .  .  ,  q .  This  implies  that  Gj*q  Gp  ^  -  I  is  an  identity  matrix,  and  that  the 


generalized  eigenvector  problem  (3.21)  will  take  the  form: 


—  \f  Rp  gf  (6-1) 

The  solutions  for  \{  will  be  the  inverse  of  the  eigenvalues  of  Rp ,  and  the  ^  will  be  the 
corresponding  eigenvectors  of  Rp.  Thus  -y2  =  l/X,^  will  equal  the  minimum  eigen¬ 
value  of  Rp,  and  the  MEM(p,q)  pole  polynomial  coefficients  dp  will  be  the 
corresponding  eigenvector  of  Rp .  From  (3.13),  the  corresponding  zero  polynomial  will 


satisfy: 


.  _  [  **, 
bk,q  ~  \  0 


for  k  =  0,  .  .  .  ,p 


(6.2) 


and  thus  Bq(z)  =  Ap(z).  Thus  every  pole  must  be  canceled  by  a  matching  zero,  and 
any  extra  zeroes  will  be  placed  at  the  origin.  Suppose  that  the  multiplicity  of  the 
minimum  eigenvalue  of  Rp  is  r .  Combining  theorems  2  and  3,  at  least  p  -  r  + 1  pole- 
zero  pairs  must  be  exactly  on  the  unit  circle. 


.V..  V..*.  . -V  V-\  *  -\  V.>V*.VV'  v WV 


Since  there  is  complete  pole-zero  cancellation,  the  model  power  spectrum  must  be 
flat,  Pp  q  (z )  =  y2,  and  the  model  correlations  are  just  R  [n  J  =  -y2  B[n  J,  where  8[0]  =  1 
and  8[n  ]  =  0  for  n  *  0.  Suppose  the  roots  of  Ap  (z )  that  are  on  the  unit  circle  are  at 
{p; },  and  that  they  have  multiplicities  {rf }.  Then  the  correlation  matching  error  for¬ 
mula  in  Theorem  5  can  be  used  to  show  that: 

i,-i 

/?[n]  =  *y28[n ]  +  22  n2J  Pj"  for  n  =0,  .  .  .  ,p  (6.3) 

i  j=  0 

The  first  term  represents  the  correlations  of  a  white  noise  sequence  with  variance  y2. 
The  second  term  represents  the  correlations  of  a  sum  of  at  least  p  -  r  + 1  complex 
exponentials.  In  effect,  we  have  modeled  the  original  correlation  sequence  R  [n  ]  as  the 
sum  of  white  noise  plus  at  least  p-r  +  l  complex  exponentials  whose  frequencies  are 
determined  by  the  roots  of  Ap  (z )  which  are  on  the  unit  circle.  Note  that  the  gains 
p.,-  j  are  not  determined  by  our  procedure,  but  would  have  to  be  calculated  by  other 
means;  for  example,  we  could  calculate  these  by  factoring  Ap{z)  to  find  the  roots  p(- , 
and  then  solve  the  set  of  linear  equations  (6.3)  for  the  p.,-  j . 

Finally,  note  that,  except  for  the  calculation  of  the  zero  polynomial, 
Bq  (z)  =  Ap  (z ) ,  this  procedure  for  fitting  a  sum  of  complex  exponentials  plus  white 
noise  to  a  given  correlation  sequence  is  identical  to  that  suggested  by  Pisarenko[10]. 
The  interesting  point  is  that  we  have  shown  that  Pisarenko's  method  is  a  special  case  of 
our  pole-zero  MEM  method  when  q^p  and  when  the  known  cepstral  coefficients  c  [n  ] 
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7.  Speculation  on  Non-Minimum  Phase  MEM(p,q)  Models 


One  troubling  aspect  of  our  MEM  (p,q)  procedure  is  that  the  zero  polynomial  is 
not  guaranteed  to  be  minimum  phase,  and  thus  the  model  is  not  guaranteed  to  match 
the  cepstra.  Consider,  for  example,  the  situation  where  we  are  given  R  [0]  and  c  [1]. 
Our  MEM  procedure  says  that  the  maximum  entropy  solution  should  be  a  single  zero 
model.  To  calculate  it,  wc  compute  the  leading  coefficients  of  the  minimum  phase  fac¬ 
tor  G  (z )  as  follows:  g[0]=l,  g[l]=c[l].  The  MEM (0,1)  model  is  then: 

p0ji(O  -  i2  j ,(>)£,* (to*)  (7.1) 

where: 

B,(z)  =  1  +  bull-' 

*1,1  =  *[11  =  c[l]  (7.2) 

.2  _  _ £ 

7  i+ kmp 

The  zero  polynomial  fl^z)  has  a  zero  at  ~b\  i  -  -c[l].  For  |c[l]|  <  1,  this  model 
will  be  minimum  phase.  For  |c[l]|  >  1,  the  model  will  not  be  minimum  phase.  In 
fact,  to  illustrate  Theorem  5,  for  |c[l]J  =  |2» j  j |  >  1,  we  can  factor  />0>1(z)  into  its 
minimum  times  maximum  phase  factors  as  follows: 


^0,l(O  =  72|^i,iI2(1+-^-0(1+t4-2  *)  (7.3) 

*1,1 

The  minimum  phase  factor  is  (1  +  1/d^z  -1),  and  the  first  cepstral  value  of  the  model 
is  c[l]  =  1/6  j  i  ,  which  does  not  equal  c  [1]  =  bj  j.  Furthermore,  the  entropy  of  this 
model  is  log ( -y 2  I ^ i ,i  1 2 ) ,  which  is  larger  than  the  value  we  expected,  logy2. 


This  is  quite  puzzling,  since  it  is  easy  to  find  high-order  ARMA  models  which  are 
minimum  phase  and  which  match  both  R  [0]  and  c[l],  even  when  |c [1]  j  >  1.  For 
n  s  |c[n]|  <  n  +  1,  we  will  need  a  model  with  a  total  of  at  least  n  +  1  poles  and 
zeroes.  In  fact,  it  is  possible  to  devise  numerical  search  techniques  for  finding  high 
order  all-pole  or  all-zero  models  which  match  R  [0]  and  c  [1]  and  which  have  maximum 
entropy  for  the  given  model  order.  For  example,  using  a  numerical  search  technique 
sketched  in  Appendix  H,  we  have  found  a  minimum  phase  500  pole  model 
y2/A  5oo(z  M  500  *)  which  has  maximum  entropy  subject  to  the  constraints  that 
R  [0]  =  5.7488221  and  c  [1]  =  - 1.1163325.  The  MEM  (0,1)  model  for  this  problem  has 
model  coefficients: 

bx  i  =  -1.1163325 

2  (7-4) 

y2  =  2.5593564 

Our  AR  (500)  model  has  gain  y2  =  2.5586002,  which  is  about  .04%  below  the  theoreti¬ 
cal  upper  bound  of  y2  above  (see  Theorem  1).  This  implies  that  the  entropy  of  the 


MEM(0,1 ) 

AR(500) 

72 

2.5593564 

2.5586002 

g[  0] 

1 

1 

g  [1] 

-1.11633 

-1.11633 

g  [2] 

0 

.00794 

g[  3] 

0 

.00768 

*[0] 

5.748822 

5.748822 

*[1] 

-2.857093 

-2.877325 

R[2] 

0 

.0000698 

R[  3] 

0 

.0000915 

Table  7.1  -  Af£Af(0,l)  versus  AR  (500)  models 
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Figure  7.1  -  Error  Between  Impulse  Response  Coefficients  of  MEM(0,1)  and 

AR(500)  Models 


AR  (500)  model,  H  =  log*/2,  is  very  nearly  equal  to  the  maximum  achievable  entropy. 
The  first  4  leading  impulse  response  coefficients  and  correlations  for  the  MEM  (0,1) 
and  AR  (500)  models  are  listed  in  table  7.1.  The  remaining  impulse  response  coeffi¬ 
cients  of  the  AR  (500)  model  taper  to  zero.  Similarly,  the  peak  AR  (500)  correlation 
value  for  2^n^501  is  .00016.  Figures  7.1  and  7.2  show  the  errors  between  the 
MEM  (0,1)  and  AR  (500)  impulse  responses  and  correlations.  Note  that  the  most  signi¬ 
ficant  difference  between  these  models  is  that  the  R  [1]  correlation  of  the  AR  (500) 
model  is  significantly  larger  than  that  of  the  MEM  (0,1)  model.  In  fact,  for  the 
AR  (500)  model,  2 R  [l]/i?  [0]  >  1,  which  is  not  possible  for  a  single  zero  model.  Figure 
7.3  shows  the  power  spectrum  of  the  AR  (500)  model  (dotted  line)  and  the  power  spec¬ 
trum  of  the  MEM  (0,1)  model  (solid  line).  Note  that  the  AR  (500)  spectrum  deviates 
substantially  from  the  MEM  (0,1)  spectrum  in  that  there  is  a  deep  notch  in  the 
AR  (500)  spectrum  at  frequency  <o  =  0.  On  the  basis  of  this  and  other  similar  experi¬ 
ments,  we  speculate  that  the  following  statements  may  be  true: 

Speculation  1  Suppose  the  zero  polynomial  Bq  (z )  in  the 
MEM  (p  -r  +  l,q  -r  +  1)  problem  has  zeroes  outside  the  unit  circle.  Then 
there  exists  a  sequence  of  minimum  phase  polynomials  Gk(z)  with  gains  yk, 
such  that  the  impulse  response  of  ykGk(z)  converges  coefficient  by  coefficient 
to  the  impulse  response  y2Bq  _r+^/Ap  _r+1(z),  and  the  entropy,  log 7^,  con¬ 
verges  from  below  to  the  upper  bound  logy2.  Thus  the  MEM(p,q)  solution, 
though  not  minimum  phase,  is  a  weak  limit  of  a  sequence  of  minimum  phase 
models  which  asymptotically  attain  the  maximum  entropy  while  meeting  all 
constraints. 

Speculation  2  The  power  spectrum  ykGk(z)Gk(l/z*)  does  not  converge  point 
by  point  to  the  MEM(p,q)  power  spectrum.  Instead,  for  each  zero 
zi  =  p {eJ  '  which  is  outside  the  unit  circle,  p{  >  1,  there  is  a  notch  in  the 
response  7 2  I  Gk(e^)  at  frequency  <0  =  fy,  and  as  k  -®,  the  value  of  the 
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power  spectrum  tends  to  zero  at  each  of  these  points. 

One  way  to  characterize  minimum  phase  polynomials  Gk  (z)  is  to  compute  the 
unwrapped  phase  as  z  =  transverses  the  unit  circle  from  to  =  0  to  2 it.  We  define 
the  winding  number  as  the  difference  between  the  unwrapped  phase  at  z  =  e-,2w  and  at 
z  =  el°  divided  by  27r.  For  a  minimum  phase  polynomial,  the  winding  number  is 
zero.  For  our  MEM ( p,q )  solution,  the  winding  number  will  equal  the  negative  of  the 
number  of  zeroes  that  are  outside  the  unit  circle.  We  speculate  that  at  the  notches  in 
the  Gk(z),  the  phase  rapidly  shifts  through  a  multiple  of  2ir  so  that  the  overall  winding 
number  of  the  polynomial  is  zero;  elsewhere  the  magnitude  and  wrapped  phase  of 
Gk (z )  approximate  the  MEM ( p,q )  model. 

8.  Estimating  the  Correlations  and  Cepstra  From  Time  Series  Data 

To  apply  our  MEM  method  to  estimate  the  power  spectrum  of  a  finite  segment  of 
time-series  data,  it  is  necessary  first  to  estimate  the  correlations  and  cepstra.  Suppose 
we  are  given  complex-valued  time  series  data  x  [0],  .  .  .  ,x[tf-l],  drawn  from  a  sta¬ 
tionary  Gaussian  random  process  with  power  spectrum  M<*)-  The  ideal  correlations 
R  [n  ]  and  cepstra  c  [n  J  of  the  process  are  defined  by  formulas  (2.2)  and  (2.4).  To  esti¬ 
mate  these  quantities  from  the  given  data,  we  can  start  by  forming  an  initial  estimate 
of  the  power  spectrum  by  applying  a  low  pass  window  w[n]  to  the  data,  such  as  a 
Hamming  window,  and  then  compute  the  periodogram  IN( a>): 


px(cJ“)  =  /,(.)  =  -J-  !*(«'“)  \2 


(8.1) 
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.  .  N-l 

where  X  (e  J w)  =  2  w[njx[»]e  J<aH 

n  =0 

=Y  M»]l2 

n  =0 

Inverse  Fourier  transforming  the  periodogram  and  the  log  periodogram  then  give  esti¬ 
mates  of  the  correlation  and  cepstra  respectively: 

R[n]=  f  Px(e^)e^n  ^  (8.2) 

“IT  ** 

IT 

i\n\  =  /  k >g  (/*,(W“)  )  ^  (8.3) 

—IT 

When  the  periodogram  is  used  as  the  power  spectrum  estimate,  and  rectangular  data 
windows  are  used,  w  [n  ]  *  l,  then  the  correlation  estimate  is  equal  to: 

*[«]  «  77  2  **[*M*+«]  for  #12:0  (8.4) 

"  *=o 

In  Appendix  I  we  argue  that  for  a  rectangular  data  window,  these  periodogram- 
based  estimates  are  asymptotically  consistent: 

E[J?[n]J  ~  rt[n] 

IT 

Var[R[n]]=i/^V")^- 

— ir 

(c  [n  1  for  n  #  0 

c  [0]  -  p,  for  »  =  0 

Var[  e  [fi  1 1  ~  ^ 

where  ~  .577  is  Euler’s  constant.  Thus  the  correlation  estimates  and  the  cepstral 
estimates  for  n  #0  are  asymptotically  unbiased  as  the  number  of  data  samples  N 


(8.5) 


(8.6) 


and  their  variance  tends  toward  0. 

One  way  to  improve  the  estimates  of  the  correlations  and  cepstra  would  be  to  first 
compute  a  smooth,  consistent  power  spectrum  estimator  Px(e^w),  and  then  use  this 
estimate  in  formulas  (8.2)  and  (8.3)  for  R  [n  ]  and  c[n].  Two  basic  approaches  could 
be  used.  A  "classical"  Blackman-Tukey  approach  would  first  estimate  L  correlations 
from  the  data  via  equation  (8.4).  It  would  then  multiply  by  a  symmetric  window 

h[-L] . h[L]  which  has  a  positive  Fourier  transform,  H (e^“) > 0.  The  Fourier 

Transform  of  the  windowed  correlations  would  then  be  a  positive  power  spectrum  esti¬ 
mate  Px{e^).  This  could  be  plugged  into  the  correlation  and  cepstral  estimation  for¬ 
mulas  (8.2)  and  (8.3),  and  then  the  MEM(p,q)  algorithm  could  be  run.  (Note  that 
the  correlation  estimates  will  equal  our  original  correlations  multiplied  by  the  window 
h[n ]).  By  choosing  the  correlation  window  length  L  much  smaller  than  the  data 
length,  L  «N,  but  much  larger  than  the  MEM  model  order,  L  »p,q,  we  can  signi¬ 
ficantly  reduce  the  variation  in  the  power  spectrum  estimate,  and  thus  decrease  the 
variation  in  the  cepstral  estimates.  The  tradeoff  is  that  a  short  window  decreases  the 
resolution  of  the  power  spectrum  estimate,  and  increases  its  bias.  The  resulting  corre¬ 
lation  and  cepstral  estimates  will  thus  also  be  biased.  In  particular,  valleys  in  the 
power  spectrum  tend  to  be  "filled  in"  by  the  windowing,  and  sharp  peaks  tend  to  be 
blurred. 

An  alternative  "modem"  approach  would  start  by  fitting  a  high  order  all-pole 
MEM  model  to  the  first  L  correlations  of  the  data.  We  choose  L  to  be  much  smaller 
than  the  number  of  data  points,  L  «  N ,  but  much  larger  than  the  desired  ARMA 


model  order,  L  »  p,q.  A  Yule- Walker  method  based  on  Levinson  Recursion  could 
be  used;  this  is  identical  to  our  technique  for  fitting  an  all-pole  MEM  (L  ,0)  model  to 


the  correlations  R  [0],  .  .  .  ,/?[!].  This  idea  has  been  suggested  in  numerous  contexts 

for  similar  problems{23, 24].  The  resulting  all-pole  model  -  is  guaranteed  to  be 

AL  ) 

minimum  phase,  and  the  first  L  model  correlations  are  guaranteed  to  match  the  given 

correlations.  The  model  ^  .  is  thus  a  good  high  order  approximation  to  the 

al  vz  ) 

minimum  phase  component  of  the  power  spectrum.  We  can  use  the  impulse  response 
of  \IAl  (z  )  as  an  estimate  of  g  [n  ],  then  use  g  [n]  together  with  the  original  correlations 
R [n]  to  compute  the  MEM(p,q )  model.  (It  is  not  necessary  to  estimate  the  cepstral 
values  directly,  though  they  could  be  derived  recursively  from  g  [n  ].)  By  decreasing  the 
model  order  L  of  the  initial  all-pole  approximation,  we  can  achieve  smoother  initial 
power  spectrum  estimates  and  thereby  get  cepstral  estimates  with  lower  variance. 
However,  decreasing  the  model  order  also  increases  the  bias  of  the  power  spectrum 
estimate,  and  the  bias  of  the  cepstral  estimate. 


9.  Experimental  Results 

Figure  9.1  shows  the  result  of  using  our  MEM  method  with  various  order  models 
to  approximate  the  impulse  response  of  a  minimum  phase  8  pole,  4  zero  model.  The 
correlations  and  cepstra  were  estimated  directly  from  the  periodogram  via  equations 
(8.2)  and  (8.3),  with  a  rectangular  data  window  w[n]  =  1.  The  dotted  lines  show  the 


periodogram  of  the  data,  while  the  solid  lines  show  the  MEM  (4,0),  MEM  (6,2), 


MEM  (8,2)  and  MEM  (8,4)  approximations.  As  expected,  increasing  the  mode!  order 
produces  a  model  spectrum  which  follows  the  actual  spectrum  more  and  more  closely. 
The  MEM  (8,4)  spectrum  exactly  matches  the  data.  For  model  orders  higher  than  this, 
the  algorithm  either  detects  that  the  largest  eigenvalue  has  multiplicity  greater  than  1 
and  automatically  decreases  the  model  order  to  its  correct  value,  or  else  it  places  extra 
canceling  pole-zero  pairs  exactly  on  the  unit  circle,  as  predicted  by  Theorem  3.  In 
either  case,  the  model  spectra  exactly  match  the  ideal  spectrum. 

Unfortunately,  our  MEM  method  does  not  perform  as  well  when  we  must  esti¬ 
mate  the  correlations  and  cepstra  from  a  finite  segment  of  time-series  data.  Figure  9.2, 
for  example,  shows  the  result  of  applying  the  method  to  data  derived  by  passing  an 
impulse  train  with  period  91  through  the  same  8  pole,  4  zero  filter  as  used  in  figure 
9.1.  12  different  datasets  were  used,  each  800  points  long,  each  differing  solely  in  the 
initial  phase  of  the  impulse  train  excitation.  Three  different  methods  were  used  to  gc  n- 
erate  an  initial  power  spectrum  estimate  from  the  12  Hamming-windowed  datasets.  In 
figure  9.2a,  we  used  the  periodogram.  In  figure  9.2b  we  applied  a  triangular  lag  win¬ 
dow  to  the  first  £  =50  correlations.  In  figure  9.2c  we  used  an  initial  MEM  (50,0) 
approximation.  Correlations  and  cepstra  were  computed  via  formulas  (8.2)  and  (8.3) 
from  these  initial  power  spectra.  MEM  (10,6)  models  were  fit  to  the  estimated  correla¬ 
tions  and  cepstra  from  each  dataset,  and  the  resulting  spectra  were  drawn  in  solid 
lines,  superimposed  on  the  periodogram.  We  deliberately  used  a  model  with  an  extra 
pole  and  zero  pair  in  order  to  demonstrate  the  behavior  when  the  model  order  is  set 


Note  that  the  periodogram-based  method  in  figure  9.2a  shows  5  strong  peaks 
instead  of  4,  with  resonances  that  are  far  too  sharp.  Note  also  the  large  variance  in 
the  spectral  models  for  the  12  datasets.  4  of  the  models  are  not  minimum  phase.  In 
figure  9.2b,  using  a  SO  point  triangular  window  on  the  correlations  before  estimating 
the  correlations  and  cepstra  gives  MEM  models  which  follow  the  4  peaks  more 
robustly.  The  valleys  of  the  spectra,  however,  show  the  characteristic  "filling  in"  due  to 
the  initial  windowing  operation.  All  of  the  models  are  minimum  phase.  If  we  substi¬ 
tute  the  original  correlations  for  the  windowed  correlations,  but  use  the  cepstra  from 
the  log  windowed  periodogram,  then  we  would  get  MEM  spectra  with  S  strong,  exces¬ 
sively  resonant  peaks.  (This  is  not  shown).  The  spectra  have  a  large  amount  of  vari¬ 
ance,  and  11  of  them  are  jt  minimum  phase.  This  suggests  that  it  is  important  to 
start  with  correlation  and  cepstral  estimates  derived  from  the  same  initial  power  spec¬ 
trum  estimate. 

Using  an  initial  MEM(50,0)  model  to  estimate  the  correlations  and  cepstra  in  fig¬ 
ure  9.2c  gives  the  best  results.  The  models  have  low  variance,  and  all  conform  closely 
to  the  original  filter  shape.  Note,  however,  the  peculiar  glitch  in  many  of  the  models 
near  a  frequency  of  0.27.  This  is  caused  by  a  nearly  canceling  pole-zero  pair  located 
virtually  on  the  unit  circle.  None  of  the  models  are  minimum  phase.  The  culprit 
appears  to  be  related  to  Theorem  3.  In  all  12  datasets,  the  multiplicity  of  the  largest 
eigenvalue  is  estimated  as  r  =  1.  Given  more  poles  and  zeros  than  necessary,  however, 
the  method  tries  to  locate  the  extra  poles  and  zeroes  someplace  where  they  will  nearly 
cancel.  Unfortunately,  by  theorem  3,  if  the  multiplicity  r  =  l,  then  the  only  place 


canceling  pole-zero  pairs  can  be  located  is  exactly  on  the  unit  circle.  Since  the  extra 
poles  and  zeroes  do  not  exactly  cancel,  they  are  only  placed  near  each  other  and  near 
the  unit  circle.  The  result  is  a  sharp,  highly  variable  "glitchH  in  the  power  spectrum. 

Figure  9.2d  shows  the  results  when  the  correct  model  order,  MEM  (8,4)  is  used, 
with  an  initial  MEM  (40,0)  model  to  estimate  the  correlations  and  cepstra.  Note  that 
the  12  model  spectra  are  quite  accurate,  except  for  some  variance  in  the  depth  of  the 
high  frequency  null  in  the  spectrum.  All  are  minimum  phase. 

Figure  9.3  shows  the  result  of  applying  the  method  to  a  similar  problem,  where 
white  Gaussian  noise  is  filtered  through  the  same  8  pole,  4  zero  model  to  produce  a 
colored  Gaussian  signal.  12  different  datasets  were  generated,  each  800  points  long. 
Exactly  the  same  processing  was  used,  except  that  the  model  order  was  increased  to 
MEM  (12,8).  Figure  9.3a  was  generated  using  an  initial  power  spectrum  estimate  equal 
to  the  periodogram.  Figure  9.3b  used  a  60  point  triangular  lag  window,  while  figure 
9.3c  used  an  initial  MEM  (60,0)  model.  The  periodogram-based  and  MEM  (60,0) 
based  models  are  very  similar;  all  show  some  variance  in  the  spectral  envelope,  together 
with  2  sharp  glitches  caused  by  the  4  extra  poles  and  zeroes  being  located  in  nearly 
canceling  pairs  near  the  unit  circle.  The  triangular-window  based  models  have  much 
less  variance,  but  show  the  characteristic  "filling  in"  of  the  valleys.  Figure  9.3d  shows 
the  models  generated  with  the  correct  MEM  (8,4)  model  order,  using  the  correlations 
and  cepstra  derived  from  the  periodogram.  Note  that  the  correct  spectral  envelope  is 
captured  in  all  12  datasets. 


Figure  9.2b  -  Impulse  Train  Excited  Filter,  Triangular  Window  Based 
MEM(10,6)  Model  (solid),  Periodogram  (dotted) 
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Figure  9.3a  •  Colored  Gaussian  Noise,  Periodogram  Based  MEM(12,8) 
Model  (solid),  Periodogram  (dotted) 
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Figure  9.3b  -  Colored  Gaussian  Noise,  Triangular  Window  Based 
MEM(12,8)  Model  (solid),  Periodogram  (dotted) 
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Figure  9.3d  •  Colored  Gaussian  Noise,  AR(40)  Based  MEM(8,4)  Model 

(solid),  Periodogram  (dotted) 
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In  general,  resolution  in  the  model  power  spectrum  appears  to  be  controlled  by 
the  number  of  poles  used  in  the  model;  adding  more  zeroes  tends  to  improve  the  shape 
of  the  spectral  peaks.  In  practice  it  is  rare  for  the  method  to  detect  multiple  eigen¬ 
values,  even  when  the  model  order  is  set  relatively  high.  Non-minimum  phase  models 
do  not  appear  to  have  significantly  different  shape  than  minimum  phase  models  gen¬ 
erated  from  datasets  with  the  same  stochastic  behavior.  When  modeling  data  formed 
from  sinusoids  in  white  Gaussian  noise,  the  method  tends  to  locate  nearly  canceling 
pole-zero  pairs  near  the  correct  frequencies;  unfortunately,  these  result  in  peaks  whose 
position  and  shape  appear  to  be  highly  variable.  The  worst  problem,  in  general,  is  the 
presence  of  sharp  glitches  in  the  power  spectrum  when  the  model  order  is  high. 
Further  work  is  needed  to  devise  an  accurate  model  order  estimation  procedure,  or  to 
eliminate  this  behavior. 

10.  Conclusions  and  Further  Work 

We  have  shown  that  the  solution  to  the  Maximum  Entropy  spectral  estimation 
problem,  subject  to  constraints  on  the  first  p  + 1  correlations  and  first  q  cepstral  values, 
is  an  ARMA  model  with  p  poles  and  q  zeroes.  To  calculate  the  parameters  of  this 
model,  we  converted  the  Maximum  Entropy  problem  into  an  equivalent  generalized 
eigenvalue/eigenvector  problem.  The  inverse  of  the  maximum  eigenvalue  in  this  prob¬ 
lem  is  the  model  gain;  it  is  also  the  exponential  of  the  entropy  of  the  model.  The 
corresponding  eigenvector  contains  the  coefficients  of  the  pole  polynomial.  The  zero 
polynomial  coefficients  are  computed  by  a  simple  recursion.  The  sotution  is  unique, 


except  for  possible  pole-zero  cancellation.  The  model  is  also  stable,  with  all  uncanceled 
poles  inside  the  unit  circle.  If  all  the  zeroes  are  also  inside  the  unit  circle,  then  the 
computed  model  matches  ail  the  given  correlation  and  cepstral  values  and  it  is  the 
Maximum  Entropy  solution.  If  the  zero  polynomial  is  not  minimum  phase,  however, 
then  it  does  not  match  the  cepstra,  and  there  does  not  exist  any  minimum  phase  model 
which  matches  the  given  constraints,  has  a  finite  and  non-zero  power  spectrum,  and 
also  achieves  the  maximum  possible  entropy.  We  speculate,  however,  that  there  exist 
minimum  phase  models  which  match  the  correlations  and  cepstra,  whose  impulse 
responses  match  that  of  the  MEM ( p,q )  model  arbitrarily  closely,  and  whose  entropy  is 
arbitrarily  close  to  the  theoretical  upper  bound  given  by  the  MEM(p,q)  method. 

Several  issues  regarding  this  algorithm  require  considerable  work.  The  issue  of 
non-minimum  phase  models  needs  to  be  resolved  more  cleanly;  in  particular,  a  better 
explanation  would  be  helpful  for  why  the  non-minimum  phase  models  seem  to  yield 
spectra  which  are  about  as  accurate  as  the  minimum  phase  models.  Better  methods  are 
needed  to  estimate  the  correlations  and  cepstra  more  carefully  from  the  given  data,  or 
to  modify  the  algorithm  to  give  more  accurate  estimates  from  short  segments  of  data. 
Model  order  estimation  procedures  are  needed.  These  would  probably  be  based  on 

y 

monitoring  the  value  of  the  gain  y  or  the  error  e„  as  the  model  order  is  increased, 
and  picking  a  model  order  near  the  knee  in  this  curve.  Modifications  to  improve  the 
estimation  of  sinusoids  in  noise  would  be  helpful.  Most  importantly,  the  problem  of 
nearly  canceling  pole-zero  pairs  appearing  near  the  unit  circle  must  be  resolved. 
Despite  these  difficulties,  however,  this  new  pole-zero  MEM  method  is  quite  elegant, 
and  seems  quite  promising. 


Appendix  A  •  Proofs  of  Lemma  1  and  Theorem  1 


Proof  of  Lemma  1 

Assume  that  Rp  >  0.  Then  we  can  always  factor  Rp  in  the  form  Rp  =  C^O, 
where  Q  is  an  invertible  matrix.  (For  example,  Q  could  be  the  upper  triangular  factor 
computed  by  a  Choleski  decomposition  [18]).  Let  Q~H  be  the  inverse  of  Q1*.  Then: 

Q~hk1(p>«)Q  1  - 1  -  t2cthg«, gpa  Q-'  (A.!) 

Note  that  for  all  x. : 

ihQ“hG»,  G^CT'x  (A.2) 

where  r  =  G_„Q~lx.  Thus  the  matrix  Q"HG^  GpA  Q_1  is  positive  semi -definite, 

and  it  will  have  a  complete  set  of  orthonormal  eigenvectors  xq.  ...  ,Xp  and 

corresponding  eigenvalues  X0,  .  .  .  ,  \p ,  all  of  which  are  non-negative: 

Q~H<3pH,9Gp,f Q  ljc*  =  X«*  for  f=0 . p  (A.3) 

where  2: 0 

&iH*j  =  hi,j 

where  8i  f  =  1  and  8*  j  =  0  for  i  *  j .  Let  us  order  the  eigenvalues  so  that  X0  is  the 
largest  and  \p  is  the  smallest,  and  let  us  define  \maT  =  X0. 

We  next  show  the  connection  between  these  eigenvalues  and  eigenvectors,  and 
the  generalized  eigenvalue  problem  (3.21).  Suppose  ,  %  is  an  eigenvalue,  eigenvec¬ 
tor  pair  of  the  matrix  Q-H Gp  q  Q1.  Let  =  Q1^  •  Then: 

G",GP.,  *  -  Q"  (  Q’Hg^  g,*  Q"1 )  *  <A-4> 


Thus  X; ,  %  is  a  generalized  eigenvalue,  eigenvector  pair  solving  (3.21).  Conversely,  it 
is  easy  to  show  that  if  Xf,  q  solves  (3.21),  then  X,-  and  z,-  =  Qj^  is  an  eigenvalue, 
eigenvector  pair  of  the  matrix  Q-HG^  G p  q  Q-1.  Thus  the  generalized  eigenvalue, 
eigenvector  problem  (3.21)  has  the  same  (p  + 1)  non-negative  eigenvalues  as  does  the 

matrix  Q-HG^?  Gpq  Q1.  Also,  the  corresponding  eigenvectors  n0 . are 

linearly  related  to  the  eigenvectors  zq,  ...  ,Zp  by  %  -  Q-1Z|.  Therefore  they  are 


linearly  independent  and  satisfy: 

*Hr p*j  m  *iH*j  =  8i  J  (A-5) 

Now  suppose  that  the  maximum  eigenvalue  Xj^,  has  multiplicity  r,  so  that 

XmaT  =  X0  =  •  •  •  =  Xr_1.  For  model  orders  q  2:0,  the  matrix  Gp  q  has  at  least  one 

non-zero  element.  Since  Q  is  invertible,  the  number  of  non-zero  eigenvalues  of 

Q~HG/!<7  Gp  q  Q-1  will  equal  the  rank  of  Gp  q ;  since  this  is  at  least  one,  Xmax  >  0. 

Let  k  be  an  arbitrary  non-zero  vector.  Because  the  jlq,  ...  ,Xp  form  a  basis,  we 
can  always  write  u  as  a  linear  combination  of  these  eigenvectors: 


-i 


*  =  Z  ai*i 
i  =0 


(A.6) 


where  at  least  one  is  non-zero.  But  then,  using  ^  =  Q  ^  together  with  (A.l) 
and  (A.3): 


*HI =  £  1  ataJxlHQT*KJpA)Q-lxJ 

i  =0  7  =0 

=  £  £  a/0ty*ayH(l-72Xy)i; 
i =0 y=0 

=  £  I  a,-  l2(l-*y2^)  (A.7) 

i  =0 

For  0  ^  y2  <  1/Xmax,  every  term  on  the  right  hand  side  of  (A.7)  will  be  non-negative, 
and  at  least  one  will  be  strictly  positive.  Thus  ,q  >  0  for  all  y  *  Q,  and 

Yiy{p  ,q)  will  be  strictly  positive  definite.  For  y2  =  1/Xmar,  terms  0  through  r  -1  will 
be  zero,  so  that: 

=1  |  a,  (A.8) 

i  -r 

Thus  K y(p  ,q)  will  be  positive  semi-definite,  and  its  null  space  will  be  spanned  by  the 
vectors  *0*  •  •  •  >*t- 1-  Since  these  vectors  are  linearly  independent,  the  null  space  of 
Ky(p  ,q )  will  have  dimension  r.  For  y2>  l/X^,,  we  will  have  yJilCy(p  ,^)r0<  0, 
and  so  ICy(p  ,q  )  will  not  be  positive  semi-definite. 

Proof  of  Theorem  1 

If  there  exists  some  minimum  phase  model  y,  G(z)  which  matches  the  given 
correlations  and  impulse  response  coefficients,  then  section  3  shows  that  Ky(p  ,q)  must 
be  positive  semi-definite.  By  Lemma  1,  this  requires  y2:s  1/Xraax.  By  section  3, 
the  entropy  of  the  model  is  H  =  logy2,  and  thus  the  entropy  must  be  bounded  above 
by  H  <  logy2  =  log  l/X--.. 
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Appendix  B  -  Proof  of  Theorem  2 

Assume  that  Rp  >  0  and  that  XmaT  has  multiplicity  r .  For  notational  conveni¬ 
ence,  we  will  call  the  nth  order  polynomial  Vn(z)=  2  vkz  ~k  a  member  of  the  null 

k  =o 

space  of  K y(njn)  if  the  vector  of  coefficients  n  =  (vn  •  •  •  v0)T  is  a  member  of  the 
null  space.  (For  convenience,  we  drop  the  model  order  index  from  the  coefficients.) 
Also,  let  us  define  lty(n  jn )  =  R„  if  m  <0  but  n  s:  0. 

In  Appendix  A  we  proved  that  for  y2  =  l/X^,  then  K .(p  ,g)2  0,  and  its  null 
space  has  dimension  r.  This  null  space  is  spanned  by  the  generalized  eigenvectors 
•  •  •  » -1*  Let  look  for  vectors  in  the  null  space  of  K.(p  ,q)  whose  last  coef¬ 
ficient  is  zero,  v0  =  0.  This  represents  only  one  additional  linear  constraint  on  the 
values  of  v0,  .  .  .  ,  vr ;  thus  there  will  be  at  least  an  r  - 1  dimensional  linear  subspace  of 
vectors  v.'  -  {yp  •  •  •  vj0)T  which  are  elements  of  the  null  space  of  K.(p,q).  But 
equation  (4.2)  implies  that  if  K.(p ,q)x.'  =  Q,  then: 


K.(p-l,q-l) 


=  Q 


(B.l) 


Thus  K.(p-l,q-l)  must  have  a  null  space  with  dimension  at  least  equal  to  r-1. 

Applying  this  argument  recursively  for  5  =  1 . r-1,  we  can  show  that 

K .(p  -s  ,q-s)  must  have  a  null  space  with  dimension  of  at  least  r-s. 


Now  let  s  be  the  largest  integer  such  that  K .(p  - i,q-s )  has  a  non-trivial  null 

space.  By  the  reasoning  above,  s  -1.  Let  A_  .  (r )  be  any  non-zero  member  of 

P  * 


(B.2) 


the  null  space  of  K .(p -s,q  - s ): 

* 

ap-s 

Kfo-sj-s)  : 

*0 

First  we  show  that  <jq  *■  0.  Suppose  this  were  not  true.  If  £  =  p ,  then  we  would  have 

A  #(z)  =  0,  which  would  contradict  our  definition  that  A  _.(z)  is  a  non-trivial 
p  s  p  J 

member  of  the  null  space  of  K.(p -s,q -s).  For  s  <p ,  using  equation  (4.2)  we 
would  have: 


«l 

But  since  K.(p  ,?)s0,  and  since  K.(p  -s  -l,q  -s  - 1)  is  just  the p  - s  - 1  order  prin¬ 
ciple  minor  of  K-Q? ,q),  we  must  have  K^(p  -s -l,q -s -1)  2:  0.  By  definition  of  s , 
however,  K.(p  -s  -\,q  -s  -1)  can  have  only  a  trivial  null  space  and  so  this  matrix 
must  be  strictly  positive  definite.  But  then  (B.3)  implies  that  an  =  0  for 

n  =  1 . p  -s .  This,  however,  contradicts  our  original  assumption  that  Ap  .(z)  is 

non-zero.  Thus  we  have  shown  that  a0  *  0.  We  can  therefore  scale  Ap  _^(z)  so  that 
dft  =  1,  and  this  scaled  polynomial  will  still  be  a  member  of  the  null  space  of 


(B.3) 
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K.(p  —s  ,q  -£).  From  now  on,  we  assume  that  a0  =  1- 

Clearly  y,  Api(z)  are  gain  and  pole  polynomial  solutions  for  the 
MEM (p-s  ,q-s)  problem.  Let  us  define  B  ^(z)  to  be  the  corresponding  zero  poly¬ 


nomial  with  coefficients: 


4=2  g[n~k\dk 
k  =0 

where  we  define  dk  =  0  for  k  >  p  —  s . 


(B.4) 


Now  we  prove  that  both  A  _.(z )  and  z  _.(z )  are  members  of  the  nuU  space 


of  K .(p  -s  + 1  ,q  -s  + 1).  Using  (4.2): 


(■ V-i  *  "  “0  °) 


ap-s 


K.(p-s+l,q-s+l) 

’  *»0 


(B.5) 


( dn-s  *  *  *  «0  ) 


ap-s 


K.(p-s,q-s) 


Since  K  Ap  -s  + 1  ,q  - i  + 1)  s  0,  this  shows  that  z  1A  _  .(z)  is  a  member  of  the  null 

1  P  & 

space  of  K  J[p  -s  +1  ,q  -s  +  1). 


Next,  using  (4.3)  we  can  show  that: 


;s_i 


*.  \ 


+  l,q~s  +  1) 


(a  . 
v  p-s 


\  ap  -i 

K.(p-s,q-s) 


-  q—S  +1 

-  T  2  giq-s+l-nft 


-2  .*  2 
7 


But  since  K.(p-i  +  l,f -i  +  l)2  0,  this  equation  must  actually  equal  zero.  This 
shows  that  Ap  (z )  is  a  member  of  the  null  space  of  K^(p -§  + 1  ,q  -f  + 1).  In  addi¬ 
tion,  since  y2=  1/Xmax>0,  we  must  have  bq_s  +  1  =  0.  Since  both  Ap_.(z)  and 
z~1Ap_.(z)  are  members  of  the  null  space  of  K,.(p  -s  +  l,q -s  +  1),  and  these  two 
polynomials  correspond  to  linearly  independent  vectors  of  coefficients,  the  dimension 
of  the  null  space  of  K^(p  -i  +  l,g-i  +  l)  must  be  at  least  one  greater  than  the  dimen¬ 
sion  of  the  null  space  of  K .(p  -s  ,q  -s). 

Applying  this  argument  recursively,  we  can  show  that  for  j  =  0 . s ,  the  poly¬ 

nomials  z~kAp_.(z)  for  k  =  0,  .... s  are  all  members  of  the  null  space  of 
K .{p  -s  +s  ,q  -s  +s).  Since  these  members  are  all  linearly  independent,  the  dimen¬ 
sion  of  the  null  space  of  K .{p  ,q),  which  we  assumed  to  be  r ,  must  be  at  least  s  + 1. 


cm 


Thus  r  > i  + 1 ;  however,  we  earlier  showed  that  rsf  +  l.  We  conclude  that 


r  =5  +  1.  Furthermore,  the  same  recursive  argument  shows  that  bn  =  0  for 

n  =  q -r  +2,  .  .  .  ,  q .  This  in  turn  implies  the  following: 

a)  Matrix  K.(p -r  +  l,?-r  +  l)20  with  a  null  space  with  dimension  one.  There  is 
a  unique  member  of  its  null  space,  Ap  _r  +  1(z),  with  leading  coefficient  of  d0  =  1. 

b)  For  s  in  the  range  0  <  s  <  r ,  the  optimal  gain  for  the  MEM (p-s  ,q-s)  problem 
is  exactly  the  same  as  the  optimal  gain  y  =  l/A.^^  for  the  MEM  (p  ,q)  problem. 
(This  is  because  Lemma  1  guarantees  that  there  is  exactly  one  value  of  the  gain 
for  which  the  matrix  ICy (p  -s  ,q  ~s  )  is  exactly  positive  semi-definite;  y  is  this  gain 
value.) 

c)  For  s  in  the  range  Osj  <  r ,  the  null  space  of  K.(p  -s  ,q -s)  has  dimension 

r  -  s ,  and  is  spanned  by  Ap  _r  +  j(z  ) . z  ~r  +s  +  lAp  _r  +  j(z  ).  (The  dimension 

r-s  is  a  consequence  of  our  proof  that  s  =  r  -1.  These  vectors  must  be  a  basis 
because  we  have  shown  that  all  r  -  j  of  them  belong  to  the  null  space,  and  they 
are  all  linearly  independent;  thus  they  must  span  the  null  space.) 

d)  For  s  in  the  range  rsjsp,  matrix  K  Jip -s,q-s)>  0.  (This  is  true  because 
K  J(p  -s  ,q  -s )  is  a  principle  minor  of  K .(p  ,q),  and  thus  must  be  at  least  positive 

semi-definite.  It  is  strictly  positive  definite  because  s  =  r  - 1  is  the  largest  value  of 
5  for  which  K^(p  -s  ,q  -s)  has  a  non-trivial  null-space.) 

e)  For  s  in  the  range  Osj  <r,  Ap  _,(r)  is  a  solution  to  the  MEM  (p  -s  ,q  ~s) 
problem  with  dg  =  1.  if  and  only  if: 


«  /  /■  .*•  /  •  .'•V  "v Y*V'V-V*V»Y- y*V«V-Y Y*Y YY-Y-Y  V  Y-Y*  Y*~ 

'‘/[‘.•j*/*"/*  «'/■  *■  y*  j 

'■*  it-  *  fn,  *1  /it,  i-.-’  V,  r  p  -V  j  -V  — -N  -  •-**]«  A  wV  .%  — **s  i.  .'i  * «  i  i  ' 


Ap-s(*)=  2  4>nz  "^p-r  +  l(z)  =  <t»r-j-l(zMp-r  +  l(z)  (B.7) 

n  =0 

where  <t>r_,  _j(z)  is  some  polynomial  of  order  r-j-1  with  leading  coefficient 
4*0  =  1*  (This  is  because  Ap_s(z)  must  be  in  the  null  space  of  K.(p -s  ,q - s ), 
and  therefore  must  be  a  linear  combination  of  the  basis  vectors 
Ap  _r  _j(z),  .  .  .  ,z-r+J  +  1j4/>_r_1(z).  Conversely,  every  such  linear  combina¬ 
tion  must  be  an  element  of  the  null  space,  and  therefore  must  be  a  solution  to  the 
problem.  We  must  have  <j>0  =  1  in  order  to  make  aQ  =  1.) 

Finally,  for  s  in  the  range  0  ^  s  <  r ,  let  Ap  _J(z)  be  any  pole  polynomial  solution  to 
the  MEM (p  —s  ,q -s)  problem,  so  that  Ap  _s(z)  =  4>r  _5  -i(z)Ap  _r  +  j(z)  for  some 
r-s-1  order  polynomial  $r_s  _x(z)  with  4>0  =  1.  Let  Bq_s(z)  be  the  corresponding 
q-s  order  zero  polynomial  with  coefficients; 

A 

bn  =  2  S[n~k^ik  for  n=0,  .  .  .  ,q-s  (B.8) 

*=0 

where  we  define  ak  =  0  for  k  >  p  -  s .  Substituting  for  ak : 

bn  =  2  2  *[" -*!«*-/$/  (B.9) 

*=o  /  =o 

=  2  4»z  £  g[n-k]ak_, 

1=0  k=l 

r-s-l 

=  2  *lbn-l 

1=0 

where  an  and  bn  are  the  coefficients  of  the  A/?_r+j(z)  and  Bq  _r +1(z)  polynomials. 
But  since  bn  =  0  for  n  -  q  -  r  +  2,  .  .  .  ,  q ,  this  implies  that: 


=  4>r-s-\(z)Bq.  r  +  1(z) 


(B.10) 


Thus  Ap _ j (z )  and  Bq_s(z)  share  up  to  r-s-1  common  factors  (z).  The 

power  spectrum,  however,  depends  only  on  the  ratio  of  Bq  _s(z)/Ap _s (z),  and  there¬ 
fore  is  uniquely  determined,  and  is  identical  to  the  MEM (p  -  r  +  l,q  -  r  +  1)  power 
spectrum  Pp_r+ljq_r+1(z). 

The  multiplicity  r  must  be  less  than  or  equal  to  the  total  number  of  eigenvalues, 
rssp  +  l.  Also  for  p  +  12;  n  >  q  +  1,  K^(p -n  +l,q -n  + 1)  =  Rp_B+1  >  0.  Thus 
r^^  +  1.  Combining  gives  r  s  min(p  ,$)+ 1. 


Appendix  C  •  Proof  of  Theorem  3  and  part  A  of  Theorem  7 


Assume  that  Rp  >  0  and  that  has  multiplicity  r .  Set  -y2  in  the  range 
0s72sl/Xmax,  so  that  JCy(p  ,<?)  s  0.  For  convenience,  we  will  assume  that  if 
y2  =  y2  then  r  =  1;  the  case  r>l  can  be  treated  by  replacing  p  and  q  everywhere 
below  with  p-r  +  1  and  q-r  +  1  respectively.  For  notational  convenience,  let 
K  =  K y(p  ,q).  Let  dp  be  a  solution  to  the  minimization  problem  in  (5.1)  for  n  =  p : 

4>  -  ““  apHKar  (C.1) 

Then  ep  =  a^Kdp  s 0. 


If  p  =  0,  then  Ap  (z )  =  1,  and  the  pole  polynomial  is  stable.  For  p  >  1,  let  us  fac¬ 
tor  the  polynomial  Ap{z)  =  (1-pz -1)4»(z)  where  i|i(z)  is  a  p-1  order  polynomial 
with  v|/q  ~  1 ,  and  p  is  one  complex  root  of  Ap  (z).  Minimizing  (B.3)  over  all 

a i,  ...  ,ap  is  equivalent  to  minimizing  over  all  4*1 . 'I'p-i  and  p.  In  particular, 

if  we  factor  the  minimizing  polynomial  Ap(z)  =  (1-pz  _1)4>(z),  and  define  vectors: 


fVi] 

'  0 

; 

*p- 1 

= 

4»o 

±-  - 

.  0  . 

.  *0  , 

then 

*  / 

4>+  ±- 

1  f  “p ) 

=  ( 

u  t  J 

(C.2) 


(C.3) 


and  p  must  be  the  solution  to  the  following  quadratic  minimization  problem: 
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p  -  min 

p 


(v.n*n  .  ,f-p) 

1  4h  k(4+S.)  |  ,  j 

k  J 


(C.4) 


This  minimum  occurs  at: 


p  =  - 


i?*£_ 


(C.5) 


and  the  value  at  the  minimum  is: 


(C.6) 


(C.6) 

p  -  - 

But  €p  a:  0  and  by  am  jugate  symmetry,  ^  »^l1Kifr+  j  .  Thus  combining 


(C.5)  and  (C.6): 


I  - 12  i  Ki-  ~  h 
'p'  " 


(C.7) 


Using  (C.2)  and  (4.2): 


<?K*+-  (*’-«  -*•*)  K+-U-1) 


(C.8) 


where  strict  inequality  must  hold  since  p  >  1,  r  =  1,  i|»0=  1,  and  by  Theorem  2, 


Ky(p  -l,q  -1)  >  0.  This  guarantees  that  (C.5)  is  well-defined.  Using  (C.2)  and 


(4.3): 


.?K*_  -  1  ^  K.(p  -  l,q  - 1) 


a  2  i  *■  i  2 

~  7  I  I 


(C.9) 


M  •.  ■*  sA^V  i  V4  V  i 4.T  i*  %  «  J'S»J  a  .  v»  v  Jai  £  i  .  n  .  u  \  


raw* 


where  we  define: 


Aw  =  2  8 
*=o 


(C.10) 


and  where  we  define  =  0  for  k  >  p  - 1.  Combining  (C.9)  and  (C.7): 


±+K±+ 


(C.11) 


The  pole  p  must  therefore  be  inside  or  on  the  unit  circle.  Since  we  chose  the  root  arbi¬ 
trarily,  all  poles  in  Ap  (z  )  must  be  inside  or  on  the  unit  circle. 

If  the  gain  y2  is  below  y2  =  then  K  >  0  and  ep  >  0.  Thus  |p  |  <  1,  and 

every  root  of  Ap(z)  will  be  strictly  inside  the  unit  circle.  (This  proves  part  A  of 
Theorem  7).  If  y2  =  y2  =  l/XmaT,  however,  then  *p  =0,  and  Ap(z)  will  be  the 
MEM  (p  ,q)  pole  polynomial.  All  the  roots  of  Ap{z )  will  be  either  inside  or  on  the 
unit  circle.  From  (C.U),  a  root  of  Ap (r )  can  be  on  the  unit  circle  if  and  only  if 
|i4  =  0.  But  then  if  we  examine  the  zero  polynomial  corresponding  to  Ap(z)  : 


K  =  2  8[n-k\ak 

k  =0 
n 

=  2  s[n~k)^k  ~ 

*=0 

■  A*  -  PA„-1 


(C.12) 


so  that: 


w 

mt\ 

Bq(z)  =  (l-pz-1)^!)  (C.13) 

where  |X(z)  is  a  q  -1  order  polynomial  with  coefficients  ....  |1^  _j.  Thus  if  the 

MEM  ip, q)  pole  polynomial  Ap(z)  has  a  pole  p  on  the  unit  circle,  then  the 
corresponding  zero  polynomial  Bq(z )  must  have  a  canceling  zero  on  the  unit  circle  at 
P- 

Now  we  will  prove  that  all  canceling  pole-zero  pairs  in  the  MEM  (p  ,q)  model 
must  be  on  the  unit  circle.  Suppose  that  Ap(z)  and  Bq(z)  share  a  common  factor 
(l-pz-1).  As  above,  Ap(z)  =  (1-pz -1)»Ji(z),  where  <|»(z)  is  an  order  p-1  polyno¬ 
mial  with  leading  coefficient  i£q  =  1.  The  coefficients  of  the  corresponding  zero  poly¬ 
nomial  must  obey  (C.12).  This  implies  that: 

=  (1-P*-1)  In 

But  we  assumed  that  (1-pz-1)  was  a  factor  of  Bq (z);  thus  ji?  =  0.  But  then  from 
(C.ll),  |  p  |  =  1,  and  so  this  canceling  pole-zero  pair  must  be  on  the  unit  circle. 
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Appendix  D  -  Proof  of  Theorems  4  and  5 


Proof  of  4A) 

Let  g  [n]  be  the  causal  impulse  response  of  Bq  (z  )lAp  (z ) .  By  definition,  for 
n  <  0,  i  [n  ]  =  0,  and  for  0  s  n  ^  q : 

^  g[n~k]ak  =  bk  (D.l) 

k  =0 

But  the  MEM (p,q )  solution  must  also  satisfy  equation  (3.13): 


2  g fa-*]**  =  bk  for  n  =0,  ...  ,q  (D.2) 

k  =0 

Define  eg [n]  =  g [n]-g[n],  subtract  (D.l)  from  (D.2),  and  write  the  result  in  matrix 
form: 


1 

1 


(D.3) 


al 


[9] 


V  / 

Since  the  matrix  is  invertible,  *g  [n  ]  =  0  for  n  =  0,  .  .  .  ,  q  and  the  first  q  + 1  coeffi¬ 
cients  of  the  model  impulse  response  must  match  the  given  data,  g  [n  ]  =  g  [n  ]. 


Proof  of  4B) 


Let  Pp  q(z)  be  the  model  power  spectrum: 


Pp,q(z)  =  f 


Bq(z 

Ap(z)Ap\llz') 


(D.4) 
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and  let  R  [it  ]  be  the  correlations  of  the  model.  R  [n]  is  just  the  inverse  z-transform  of 
Pp  q(z).  Multiplying  through  by  Ap(z),  identifying  the  leading  coefficients  of 
B*{\lz*)IA*(\lz')  as  g *[-n],  and  equating  terms  with  equal  powers  of  z,  we  get  an 
equation  similar  in  form  to  (3.16): 

Rpd  =  y2dPtt£  (D.5) 

where  correlation  matrix  has  (n  ,m  ),h  element  /?[m  -n],  and  Gpq  has  (n  ,m  ),h 

element  g [q  -p  +m  -it].  But  the  MEM(p,q )  model  y,  d,  b.  must  also  satisfy  equa¬ 
tion  (3.16).  Recognizing  that  g[n)  =  g[n]  for  n  ^ q ,  and  thus  Gpq  =  Gp  q ,  sub¬ 
tracting  (D.5)  from  (3.16)  gives: 


e*[°]  **  «*[p] 

•  •  • 


(V 

• 

.  1 

=  Q 


(D.6) 


where  *R [n  ]  =  R  [n  ] -R  [n  ]  is  the  correlation  matching  error.  To  analyze  these  error 
values,  let  us  extrapolate  the  tails  of  «/*[»]  by  forward  and  backward  prediction  using 
the  pole  polynomial  coefficients  ak : 


«*[»  1=  '•*  -optR[n-p]  for  n  =  p  +  l,p  +2,...  (D.7) 

«*[»]=  +  •“  ~aptR[n+p]  for  it  =  -p -1,-p -2,... 

Using  (D.6)  and  (D.7)  we  can  show  that: 

f)  i  =  0  for  all  n  (D.8) 

/  =o  *  =o 

If  the  polynomial  Ap  (z )  has  all  its  roots  strictly  inside  the  unit  circle,  then  the  extrapo¬ 
lated  sequence  *R  [n]  will  decay  exponentially  to  zero,  and  thus  must  be  absolutely 


summable.  Therefore  the  Fourier  transform  *R  (e^)  exists  and  is  finite  for  all  to. 
Fourier  Transforming  (D.8)  gives: 

I  V''“>  I2  =  0  (D'9) 

Since  all  of  the  roots  of  A.  (z)  are  strictly  inside  the  unit  circle,  I  Ap  (r;iu)  i 2  >  0  for 
all  <o.  Thus  we  must  have  (e^m)  =  0  for  all  <o,  which  implies  that  [n  ]  =  0  for  all 
n .  Thus  the  model  must  match  the  given  correlations. 

Proof  of  5A) 

If  Ap(z)  has  one  or  more  roots  pt  on  the  unit  circle,  pj  =  e^‘ ,  then  the  argu¬ 
ment  above  breaks  down.  First  of  all.  Theorem  3  guarantees  that  any  poles  on  the 
unit  circle  must  be  canceled  by  matching  zeroes.  Thus  the  power  spectrum  of  the 
model  has  a  Region  Of  Convergence  which  includes  the  unit  circle,  and  the  correla¬ 
tions  are  still  well-defined.  The  extrapolated  tails  of  tR  [n],  however,  may  oscillate  at 
the  frequencies  of  the  roots  on  the  unit  circle.  Suppose  that  the  roots  of  Ap  (z )  are  all 
different.  Then  the  tails  of  eg[n]  will  be  bounded,  but  may  not  be  absolutely  summ¬ 
able.  The  Fourier  Transform  e*  may  therefore  contain  impulses.  In  particular, 
since  |  Ap(e^ia)  |  2  =  0  at  the  frequencies  u  =  w* ,  a  non-zero  solution  for  eR(eJ<*)  in 

j  (|) . 

(D.9)  will  be  a  weighted  sum  of  impulses  located  at  the  unit  circle  roots  e  1 : 

i 

for  some  set  of  weights  p.( .  Inverse  Transforming: 


(D.10) 


(D.ll) 


=  2  M-/  p" 

/ 

More  generally,  if  Ap  (z )  has  roots  pi  with  multiplicities  sit  then  tR  (e^m)  will  be  a 
weighted  sum  of  impulses  and  impulse  derivatives  of  order  up  to  -1.  Since  eA[n] 
is  even,  however,  only  even  order  derivatives  can  be  present.  Thus  tR  (e^u)  will  con¬ 
tain  terms  such  as  p  /*,  n2  p/*,...,  n*'  2p/*. 

Proof  of  4C) 

If  G(z)  =  Bq(z)/Ap(z)  is  strictly  minimum  phase,  then  by  Theorem  4A,  its 
impulse  response  has  the  correct  first  q  + 1  coefficients g [0],  .  .  .  ,g[q ].  Rewriting  the 
recursive  equation  (3.10)  to  define  the  cepstra  in  terms  of  the  minimum  phase  coeffi¬ 
cients: 

c[n]  “  “  2  **[*]*[»“*!  (D.12) 

n  k  =0 

This  equation  provides  a  one-to-one  mapping  between  the  #[0],  .  .  .  ,#[<?]  and 

c  [1] . c  [<7  ].  Thus  the  first  q  + 1  cepstral  coefficients  c  [n  ]  must  equal  the  specified 

values  c [1],  .  .  .  ,c[q\.  Furthermore,  the  entropy  of  the  model  is  logy*.  By  Theorem 
1,  however,  this  is  the  maximum  possible  entropy  of  any  model  matching  the  given 
data.  Therefore,  the  MEM  (p  ,q)  solution  is  indeed  the  solution  to  the  constrained 
Maximum  Entropy  problem.  By  Theorem  2,  it  is  unique  except  for  possible  pole-zero 
cancellation  if  the  multiplicity  of  is  greater  than  1. 


Proof  of  5B) 


Suppose  that  Bq  (z )  is  not  minimum  phase,  so  that  it  has  zeroes  strictly  outside  the 
unit  circle.  Suppose  there  are  q0  zeroes  outside  the  unit  circle,  pq,  .  .  .  ,  PqQ-i>  and 


q  -qQ  zeroes  inside,  p^, 


P9-l- 


bm) =  n  (i_p iz  !)  *n  (i_p jz  *) 


(D.13) 


To  calculate  the  model  cepstral  values  c[l],  .  .  .  , c[g],  we  need  to  factor  the  model 
power  spectrum  into  a  gain  times  a  minimum  phase  factor  times  a  maximum  phase  fac¬ 
tor.  In  this  case  we  get: 


r.A*)  = 


0,(0  p,  (i/» ) 

V2) 


(DM) 


where: 


q°~x  1  i  «_l  i 

p,(o=  n  a— V*-1)  n  (i-pj2'1) 

i  =0  Pi  j  =q0 

y2  *  y2  n  I  Pi  1 2 


(D.15) 


The  impulse  response  g'[n]  of  this  minimum  phase  factor  D  (z  )/A  (z )  satisfies: 


g'l  0] 


' 


(D.16) 


aq  *  <*i  1  ^ -S' 


where  the  dk  are  the  coefficients  of  D  (z ).  Subtracting  this  from  (D.2): 
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where  the  strict  inequality  holds  because  pq»  .  .  .  >P9o-i  are  outside  the  unit  circle, 

and  thus  have  magnitudes  greater  than  one.  The  MEM (p,q)  model  does  not  solve  the 
original  Maximum  Entropy  problem.  We  conclude  that  there  does  not  exist  any 
strictly  positive,  finite  power  spectrum  P(ejw)  meeting  the  Paley-Wiener  condition 
which  matches  the  given  correlations  and  cepstra,  and  which  achieves  the  maximum 
possible  entropy. 


Appendix  E  -  Derivation  of  Fast  Levinson-Style  Algorithm 


Assume  that  we  already  have  the  values  an~\,  dn-\,  /*-i  and  Now 


using  (4.2): 


K^(n  ,n+q-p) 


an  —  l,n  -1 

a\,n-\ 

1 


- 


«n-l 


n  — 1  _ 

where  <!>„  =  2  R(n  ~k)ak,n-l 
k=0 


*«  =  2  gli-p+*-kY*k+-i 
*  =  o 


Using  (4.3): 


g  fe-p] 


g  [ n+q-p ) 


(E.l) 


'  <*0,n-l  [lj  0 

K,(».»^-p)  -  :  +  0 

0  0 

where  [  ^(n, n+q-p) 


/  0,11  -1 


g*[q-p] 


K^in, n+q-p)  f  = 

o  U’[*+*-p] 


(E.2) 


(E.3) 


H  —  1  r  1  • 

where  M-*  =  £  K Jn, n+q-p)  ,fk,n- 1  ~  g  [n+g~p] 

*=o  l  J  n* 

We  can  now  compute  a„  by  combining  linear  multiples  of  dn  _i  and  /„  _  j,  then  com¬ 
pute  dn  by  combining  linear  multiples  of  dn  _  j  and  an ,  and  then  compute  by  com¬ 
bining  linear  multiples  of  f*  _j  and  an .  The  value  of  e„  can  be  found  by  direct  substi- 


tution.  These  update  formulas  are  given  in  the  algorithm  listing  in  section  5. 

To  compute  M„  recursively,  we  start  by  expanding  dn  and  in  terms  of  dn  _j, 
fjt-l  and  a„: 


dn* 1  0 


v*aH 

vn 


(E.4) 


M”  =  [  (/»H-i  oj  -  17  l^HJ  J  ’  (E4) 

( W]  “  ^  (  v»a"  W') 

Writing  this  out  as  a  sum  of  four  terms,  and  using  (4.2)  and  (5.3),  gives  the  update 
formula  for  M„  in  section  S.  Finally,  we  can  avoid  having  to  calculate  vn  and  jjlw 


directly  by  noting  that: 


dn*l  0 


,H  n  K1(r'’n+q-p)a„n  = 

Ln  ~\ 


dn*  1  0 
/nH-l  0 


(E.5) 


f  °) 
=  i°J 


But  also: 


dS- 1  0 


fJi_x  o  ^i(nfn+q-p)a^ 


(E.6) 


dn*  1  0 

£**- 1  0 


K Jn  ,n+q-p) 


U-,1 


dn—  1  (jx  - 1 
0  0 


y2^n 


/  »  *  w  *  *  * 


[sfa-p]  '  •  •  g[n+q-p]+  M.„  J  "  1  ^724»« 


*n  + 


+  M 


Equating  these  gives  the  formula  relating  vR ,  *jl„  to  4>„ ,  4>„  given  in  section  5.  Thus 
we  can  derive  v„  and  \i.n  from  this  formula,  rather  than  calculating  them  directly. 
This  is  particularly  advantageous,  since  we  no  longer  need  to  calculate  the  actual  ele¬ 
ments  of  the  Ky(n  ,n+q-p  )  matrix. 


Appendix  F  -  Proof  of  Theorem  6 


The  proof  of  this  theorem  relies  on  two  key  formulas  involving  the  en .  The  first 
formula  is: 

«o  0 

AHiey(p,<7)A=  (F.l) 

0 

1  *u  •  apt 
1 

where  A  *  •  «„ 

.0  l 

To  prove  this  formula,  note  that  for  n  £  m ,  the  (n  )!ft  element  of  the  matrix  on  the 
left  is: 

Uhqh)  (**,}  UHuH)  rF2x 

*  '  K,(P,q)  [  q  J  =  '  '  K^im/n+q-p)  am  (R2) 

0 

0 
•m 

(e„  if  n  =  m 

0  if  n  <  m 

where  we  used  (4.2)  in  the  first  line  and  (S.3)  in  the  second.  The  case  n  >m  can  be 
treated  similarly.  We  now  invoke  Sylvester’s  law  of  inertia[18, 19]  which  states  that 
since  A  is  invertible,  the  matrix  AH  Ky(p  ,q)  A  must  have  the  same  number  of  positive 
eigenvalues  as  K~(p,q),  the  same  number  of  negative  eigenvalues,  and  the  same 


number  of  zero  eigenvalues.  The  eigenvalues  of  AnK1(p,q)A,  however,  are 
tQ,  ...  ,tp.  Thus,  using  Lemma  1,  if  0  s  y2  <  then  K ^(p  ,q)  >  0  and  all  the 

values  of  e„  must  be  strictly  positive.  If  y2>  l/X^,  then  K. y(p  ,q)  has  at  least  one 
negative  eigenvalue,  so  at  least  one  value  e„  is  negative.  If  y2  =  l/XmaT,  then 
K ^(p  ,q )  s  0,  it  will  have  r  eigenvalues  equal  to  zero,  and  thus  r  of  the  values  will 
be  zero  and  the  rest  will  be  strictly  positive. 

To  prove  that  the  sequence  of  tn  is  decreasing  for  K^(p  ,q)  2s 0,  we  combine  the 
formulas  for  en  and  vn ,  in  the  Mullis-Roberts  algorithm  in  section  5: 


=  «„-l  -  '  -y1 


(F.3) 


€n  —  1 


(  4>*  ) 


-y2*n 


-  *u2 


y2 1  «1»,  I2 


Now  if  then  by  (4.2)  K^n  -l,n  -1  +  q  -p)  s  0,  and  by  (5.6), 

M„  _!  ss  0  also.  Thus  «„  ^  _i  for  all  n .  For  y2  s  l/X^,  therefore,  the  values  of 
e  decrease  with  n ,  and  are  non-negative.  For  y2  =  l/XmaT,  the  last  r  values  must  be 


zero. 


For  K^(p  ,q)  >  0,  all  the  e„  will  be  strictly  positive,  and  therefore  we  can  run  the 
Mullis-Roberts  algorithm  for  steps  n  =  0,  p  to  calculate  all  the  a* ,  «n  •  If 
K-y(P »?)  a  0  wfth  811  r  dimensional  null  space,  then  e„  >  0  for  n  -  0,  .  .  .  ,p -r  but 
tip  _r  +  1  =  0.  Thus  we  must  stop  the  Mullis-Roberts  algorithm  at  step  n  =  p  -r  + 1. 
However,  Theorem  2  guarantees  that  the  vectors  a„  =  ( 0  1  )T  for 


A 


n  =  p  -r  +2,  .  .  .  ,p  will  be  elements  of  the  null  space  of  K^(n  ,n  +q  -p),  and  thus 
satisfy  the  assumptions  of  this  theorem  with  «„  =  0  for  n  =  p  -r  +2,  .  .  .  ,p. 


Appendix  G  -  Proof  of  Theorem  7 

Let  0  s  *y2  <  l/XrnaT.  By  Theorem  6,  the  Mullis-Roberts  algorithm  will  be  able  to 
calculate  the  unique  pth  order  solution  Op ,  ep . 

Appendix  C  proves  that  for  y2<y2,  the  pole  polynomial  solution  to  (5.1)  is 
strictly  stable  with  all  poles  strictly  inside  the  unit  circle.  To  prove  part  B  of  Theorem 
7,  let  us  define  the  zero  polynomial  #9(z)  by  substituting  the  coefficients  of  Ap{z)  into 
the  recursion  (3.13).  Following  the  proof  in  Appendix  D,  we  can  show  that  the  lead¬ 
ing  coefficients  of  the  impulse  response  of  our  model  exactly  match  the  given  values 
;[0],  ....$[?].  The  correlation  matching  proof  proceeds  similarly  to  that  in  Appen¬ 
dix  D,  except  that  we  find  that: 


«*[°I  •'  «j?CpJ 

•  •  • 

•  • 

«*['/>]  ”  «*[0J 


(G.l) 


We  can  extrapolate  [n]  forwards  and  backwards  as  in  Appendix  D.  We  can  then 


show  that: 

f  f)  «j? [n-k+l]d[* dk  =  e_8[nj  for  all  n 
/=  0*=0 


(0.2) 


Since  the  pole  polynomial  is  guaranteed  by  part  A  to  be  strictly  stable,  the  extrapolated 
tails  decay  exponentially  and  are  thus  absolutely  summable.  We  can  thus  Fourier 


Transform  both  sides  of  (G.2)  to  give: 


«*(*'")  v*y“)  = 


(G.3) 


'‘A  •  i  >  '  ;  1  ■■  «•  v  ■  v 
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Appendix  H  -  High  Order  Minimum  Phase  AR  Approximations  to  Non-Minimum 
Phase  Models 

To  find  a  minimum  phase  all-pole  (AR)  model  €N/AN(z)A^(l/z*)  which  matches 
R  [0]  and  c  [1]  and  which  achieves  maximum  entropy  for  a  given  model  order,  we  will 
try  searching  over  the  reflection  coefficient  domain.  The  reflection  coefficients 
k\,  ...  ,kN  are  related  to  an  Nth  order  polynomial  AN(z )  by  a  Levinson  recursion, 
which  builds  an  n'*  order  polynomial  An(z)  from  linear  combinations  of  the  n-  1th 
order  polynomial  An  _i(z): 

A„(z)  =  An_x{z)  +  knz ~nA*_i(l/z *)  for  n  =  1 - ,N  (H.l) 

where  A0(z)  =  1.  Simply  by  keeping  all  kH  less  than  1  in  magnitude,  we  can  force  the 

all-pole  model  to  be  minimum  phase.  Furthermore,  if  we  choose  the  model  gain  eN  by 

the  following  recursive  calculation: 

<0=*[°]  (H-2) 

«*  =  ‘b-iC1-  l*»  I2) 

then  each  nth  order  model  eR/AR(z)AR*(l/z*)  will  have  the  zeroth  correlation  coeffi¬ 
cient  equal  to  R  [0].  The  model  entropy  will  be: 

H  =  loge*  =  log/? [0]  +  2  log(l  -  |*R  |2)  (H.3) 

n  *1 

Furthermore,  the  first  cepstral  coefficient  will  be 

c[l]  =  ~a\/f  =  "(*i  +  2  *«*«*- 1  )  (H.4) 

H -2 

The  Maximum  Entropy  Nth  order  all-pole  model  matching  R  [0]  and  c  [1]  can  now  be 
found  by  maximizing  (H.3)  subject  to  the  constraint  (H.4)  and  subject  to  |£R  |  <  1. 


ft 


Introducing  Lagrange  multiplier  X  for  constraint  (H.4),  building  the  Lagrangian,  and 


then  differentiating  the  find  the  critical  point,  we  can  show  that  the  reflection  coeffi¬ 


cients  achieving  maximum  entropy  while  meeting  the  constraints  must  satisfy:  (we 


assume  all  quantities  are  real-valued) 


T77  “  +  */+ 1) 


(H.5) 


where  we  define  Jtg  =  1  and  kN+i  =  0.  An  exact  AR  (=»)  solution  for  c[l]  =  hj  =  -1 


is  X  =  1  and  kn  =  l/(n  +  1).  For  finite  order  AR  models  with  c[l]  somewhat  more 


negative  than  -1,  the  solution  will  generally  have  X  slightly  greater  than  1,  and  kn 


slightly  greater  than  l/(n  +  1).  A  relaxation  algorithm  based  on  equation  (H.5)  can  be 


used  to  find  these  values.  Choose  an  appropriate  AR  model  order  N,  try  an  initial 


guess  of  kn  =  l/(n  + 1),  and  guess  an  initial  value  for  X.  Now  repeatedly  solve  (H.5) 


for  kn  for  n  =  1,  .  .  .  ,N.  Iterate  until  the  estimates  converge.  While  iterating,  com¬ 


pute  the  value  of  c[l]  in  (H.4).  If  this  is  larger  (smaller)  than  desired,  then  decrease 


(increase)  X. 


'  rn  ^ .  *  •  '«  P  .5  •  «/ 
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Appendix  I  -  Asymptotic  Behavior  of  Correlations,  Cepstral  Estimates 

To  estimate  the  asymptotic  behavior  of  the  correlations  and  cepstral  estimates 
derived  from  a  periodogram,  let  us  approximate  the  complex-valued,  zero-mean  Gaus¬ 
sian  sequence  x  [n  ]  as  periodic  with  period  N .  As  N  -  « ,  we  would  expect  this  approx¬ 
imation  to  give  asymptotically  valid  formulas  for  the  means  and  variances  of  estimators 
based  on  the  data.  (See,  for  example[25]  or  [26]).  The  power  spectrum  will  be  a  line 
spectrum: 

JV-1  2ttk 

=  2  )*(«“"*)  for  <*k  =  -TT  (L1) 

*=0  " 

and  the  periodic  correlations  and  cepstra  are  defined  by: 

»  1  N  1  itit.lt 

=  77  2  (i-2) 

"  *=0 

,r  ,  1  x 
c[«]  =  T7  2  biPx(*k)e 

"  k- 0 

Let  X  (<o4  )  be  the  Discrete  Fourier  Transform  (DFT)  of  x  [n  ]: 

*(“>*)  =  2  x[n]eJ<*kn  (1.3) 

n  =0 

it  is  easy  to  show  that  the  real  and  imaginary  parts  of  (“*)  zxc  independent 

Gaussian  random  variables  with  zero  mean  and  variance  HPx(<ok): 

E  -^jfRcX(ak)  =  E  ^^ImX(<o4)  =0  (1.4) 

E  —  ReX(o)Jk)  Re X (toy)  =E  ImX((i>4)  ImX(<i>/) 


2Px^k)  for  *=° 

0  else 


E  jjRcX(<ak)  Im^o,)  =0 


Given  x[n],  let  us  estimate  the  power  as  equal  to  the  periodogram, 

Px(<»k)  =  ^  I X(<ok)  |2.  Inverse  transform  the  periodogram  and  the  log  periodogram 

to  get  the  correlation  and  cepstral  estimates  R  [n]  and  c[n]  reflectively.  To  compute 
the  mean  and  variance  of  the  estimates,  we  will  use  the  formula: 

Cov[u  ,  v  ]  =  E[uv  ]  -  E[u  ]E[v  J  (1.6) 

and  where x,y,z  and  w  are  Gaussian: 


E[xyr>v  ]  =  E[xy  ]  E[zw  ]  +  E[xz  ]E[yw  ]  +  E[xw  ]E[yz  ] 


Then: 


E[*M]  =  W  E  j;  l*(“»)l2  «y“‘"=R[n]  (1.7) 

Cov [*[«],*[/»]]  =  21cOT.fi|jf(«.t)|Ji|x(0),)|Jj  e'K"-“<"> 

1  1  N*  k- 0  i-0  "  N 

-  (I.*) 

N  k~  0 

To  compute  the  statistics  of  the  cepstral  estimates,  we  will  use  formulas  from  [27]: 


E[C'("']  =  W%E 


TT  2  log/’jfaj)  +  f  e  'lop*  eJ“* 
n  *-o  0 


=  c[/i]  -  «y8[n] 


]1  N-lN-l  1  „  i 

=  3  2  2  Cov  Iog77  l^(“*)|2»logT7  l^(“/)|2 
N  k=0  1=0  N  N 

s'  (/."( » 

N *  k=0  o  V  ' 

'  j 


577  is  Euler’s  constant. 
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